Random Sampling Without Replacement
David Fanning
2010-10-13 15:46:09 UTC
Folks,

Has anyone coded up an IDL algorithm to do random
sampling without replacement?

For example, suppose I want to sample values in
my 2D image. I want, say, 100 values that represent
individual pixel locations in the image. How can
I make sure I get 100 unique, but random, locations?

Cheers,

David
--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.dfanning.com/
Sepore ma de ni thui. ("Perhaps thou speakest truth.")
wlandsman
2010-10-13 15:53:00 UTC
Post by David Fanning
Has anyone coded up an IDL algorithm to do random
sampling without replacement?

Read the master? http://tinyurl.com/26edmmq
David Fanning
2010-10-13 15:59:44 UTC
Post by wlandsman
Read the master? http://tinyurl.com/26edmmq

Ah, my general rule is that I only write articles
on topics I understand. I didn't understand this
then, and I barely understand it now. I'm just
going to have to try harder, I guess. :-)

Thanks!

Cheers,

David
Michael Galloy
2010-10-13 16:01:26 UTC
Post by David Fanning
Has anyone coded up an IDL algorithm to do random
sampling without replacement?
For example, suppose I want to sample values in
my 2D image. I want, say, 100 values that represent
individual pixel locations in the image. How can
I make sure I get 100 unique, but random, locations?

To randomly sample m elements from n elements, I have used this
technique, albeit an inefficient one, since it sorts all n random
values just to keep m of them:

m = 3
n = 100
im = findgen(n) ; input array
arr = randomu(seed, n)
ind = sort(arr)
print, im[ind[0:m-1]] ; m random elements from im
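
Applied to the original question about pixel locations, the same sort-based idea might look like the sketch below. The image, its size, and the variable names nx, ny, x, y and values are placeholders chosen for illustration, not anything from the posts above.

```idl
; Hypothetical 2D image; the size and contents are placeholders.
nx = 200
ny = 150
image = dist(nx, ny)

m = 100                       ; number of pixel locations wanted
n = n_elements(image)         ; total number of pixels

; Shuffle all 1D indices by sorting random keys, then keep the first m.
; SORT returns a permutation, so the m indices cannot repeat.
ind = (sort(randomu(seed, n)))[0:m-1]

; Convert the 1D indices to column (x) and row (y) positions.
x = ind mod nx
y = ind / nx

; Pixel values at the sampled locations (equivalent to image[ind]).
values = image[x, y]
```

Because x and y have the same length, image[x, y] pairs them element by element and returns m values.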

Mike
--
www.michaelgalloy.com
Research Mathematician
Tech-X Corporation
Heinz Stege
2010-10-14 01:14:00 UTC
Post by David Fanning
Folks,
Has anyone coded up an IDL algorithm to do random
sampling without replacement?
For example, suppose I want to sample values in
my 2D image. I want, say, 100 values that represent
individual pixel locations in the image. How can
I make sure I get 100 unique, but random, locations?
Cheers,
David

Hi all,

Here is another way to do this calculation:

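function unique_random,n,m
;
; n := total number of values
; m := number of samples
; Returns m distinct indices drawn from 0..n-1.
;
compile_opt defint32,strictarr,strictarrsubs
;
; Draw number i picks a slot among the n-i values not yet used.
inds=long(randomu(seed,m)*(n-findgen(m)))
;
table=lindgen(n)
for i=0,m-1 do begin
  j=inds[i]
  inds[i]=table[j]        ; take the value sitting in the chosen slot
  table[j]=table[n-1-i]   ; refill that slot from the end of the unused range
endfor
;
return,inds
end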

For a small number of samples (n=100000, m<50000) it is faster than
Mike's code. And if the number of samples is not very small
(n=100000, m>10000), it is even faster than JD's solution from
http://tinyurl.com/26edmmq.

This is true in spite of the presence of the for-loop. I'm surprised
myself. This algorithm may be a good overall solution for IDL.
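
To see why the table update prevents repeats, here is a small hand trace of the loop with fixed numbers standing in for the random draws. The values n = 5 and draws = [2, 2, 0] are made up for illustration. Note that slot 2 is drawn twice and still yields two different values.

```idl
; Main-level program: save to a file and .run it.
n = 5
draws = [2, 2, 0]            ; fixed stand-ins for the random draws
table = lindgen(n)           ; starts as [0, 1, 2, 3, 4]
picks = lonarr(3)

for i = 0, 2 do begin
  j = draws[i]
  picks[i] = table[j]        ; value currently stored in slot j
  table[j] = table[n-1-i]    ; overwrite slot j with the last unused value
endfor

; i=0: picks 2, table becomes [0, 1, 4, 3, 4]
; i=1: picks 4, table becomes [0, 1, 3, 3, 4]
; i=2: picks 0
print, picks                 ; 2 4 0
end
```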

Heinz
Heinz Stege
2010-10-14 02:59:23 UTC
Post by Heinz Stege
function unique_random,n,m

Of course we should add the seed to the parameter list:

function unique_random,n,m,seed
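
A sketch of how the three-argument version might be called, assuming the caller keeps one seed variable and hands it back in on every call. The function body repeats the algorithm from the earlier post with only the parameter list changed. The loop, the values of n and m, and the uniqueness check are illustrative additions.

```idl
function unique_random, n, m, seed
  compile_opt defint32, strictarr, strictarrsubs
  inds = long(randomu(seed, m) * (n - findgen(m)))
  table = lindgen(n)
  for i = 0, m-1 do begin
    j = inds[i]
    inds[i] = table[j]
    table[j] = table[n-1-i]
  endfor
  return, inds
end

; Main-level caller. The variable seed is passed by reference, so
; RANDOMU updates it inside the function and the caller keeps the state.
n = 1000
m = 10
for k = 0, 2 do begin
  s = unique_random(n, m, seed)
  ; The trailing value is 1 when all m samples are distinct.
  print, s, n_elements(uniq(s, sort(s))) eq m
endfor
end
```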

Heinz
David Fanning
2010-10-14 03:19:16 UTC
Post by Heinz Stege
function unique_random,n,m,seed

Actually, you should probably put the seed in a common
block, or an awful lot of your "sampling" sequences
are going to look a hell of a lot alike. :-)

Cheers,

David
Heinz Stege
2010-10-14 10:44:28 UTC
Post by David Fanning
Post by Heinz Stege
function unique_random,n,m,seed
Actually, you should probably put the seed in a common
block, or an awful lot of your "sampling" sequences
are going to look a hell of a lot alike. :-)
Cheers,
David

Okay, I see. What I wanted to say is that one has to take care of the
seed, and my preference is to put it into the parameter list.

I am afraid that I would forget about today's common block and
create a second one within another routine in half a year. :-)

Greetings, Heinz
David Fanning
2010-10-14 12:27:46 UTC
Post by Heinz Stege
Okay, I see. What I wanted to say is that one has to take care of the
seed, and my preference is to put it into the parameter list.
I am afraid that I would forget about today's common block and
create a second one within another routine in half a year. :-)

Yes, half a year later you would probably be fine. However,
if you were doing this in some kind of loop, maybe using
a bootstrap process or something, passing in the seed
as a parameter is often problematic. To get a truly
random sequence of numbers, the seed has to remain "alive"
between calls to RandomU. Otherwise, you get the same
"random" sequence of numbers coming out of your program.

I think a lot of people don't realize this.
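
A minimal sketch of the common-block idea, under the assumption that a block private to one routine is acceptable. The function name sample_indices and the block name sample_indices_seed are hypothetical, and the sort-based shuffle is used only to keep the example short.

```idl
function sample_indices, n, m
  compile_opt defint32, strictarr
  ; The seed lives in a common block, so it stays defined between calls
  ; without the caller having to carry it around.
  common sample_indices_seed, seed
  ; Shuffle 0..n-1 by sorting random keys and keep the first m indices.
  return, (sort(randomu(seed, n)))[0:m-1]
end
```

Calling print, sample_indices(100, 5) repeatedly then continues one random sequence rather than restarting it, at the cost of the hidden state that was objected to earlier in the thread.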

Cheers,

David