Tuesday, April 19, 2011

Example 8.35: Grab true (not pseudo) random numbers; passing API URLs to functions or macros

Usually, we're content to use a pseudo-random number generator. But sometimes we may want numbers that are actually random; one example is randomizing treatment status in a randomized controlled trial.

The site Random.org provides truly random numbers based on atmospheric noise (radio static). Its quota system may rule it out for long simulations, but it works well for small to moderate needs, and you can purchase larger quotas if need be.

The site provides APIs for several types of information. We'll write functions to use these to pull vectors of uniform (0,1) random numbers (of 10^(-9) precision) and to check the quota. To generate random variates from other distributions, you can use the inverse probability integral transform (section 1.10.8).
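As a minimal sketch of the inverse probability integral transform mentioned above: if u is uniform on (0,1), applying a distribution's quantile function to u gives a draw from that distribution. Here runif() is only an offline stand-in (an assumption for illustration) for the uniforms that would come from Random.org; qexp() and qnorm() are the base R quantile functions.

```r
# Stand-in uniforms; in practice these would come from Random.org.
set.seed(42)                 # only so the stand-in values are reproducible
u = runif(1000)

# Quantile (inverse CDF) functions map uniform(0,1) draws to other distributions.
x_exp  = qexp(u, rate = 2)            # exponential with rate 2
x_norm = qnorm(u, mean = 0, sd = 1)   # standard normal

# The exponential inverse CDF has a closed form: -log(1 - u) / rate.
x_exp_manual = -log(1 - u) / 2
all.equal(x_exp, x_exp_manual)
```

One caution: integers drawn from 0 to 1000000000 and divided by 1000000000 can equal exactly 0 or 1, and quantile functions such as qnorm() return -Inf or Inf at those endpoints, so such values may need to be dropped or redrawn.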

The coding challenge here comes in integrating quotation marks and special characters with function and macro calls.

SAS
In SAS, the challenging bit is to pass the desired number of random numbers off to the API through the macro system. This is hard because the URL for the API includes the special characters ?, ", and especially &. The ampersand is used by the macro system to denote the start of a macro variable, and is used in URLs to indicate that an additional parameter follows.

To avoid processing these characters as part of the macro syntax, we have to enclose them within the macro quoting function %nrstr. We use this approach twice, for the fixed pieces of the API, and between them insert the macro variable that contains the number of random numbers desired. Also note that the sequence %" is used to produce the quotation mark. Then, to unmask the resulting character string and use it as intended, we %unquote it. Note that the line break shown in the filename statement must be removed for the code to work.

Finally, we read data from the URL (section 1.1.6) and transform the data to take values between 0 and 1.

%macro rands (outds=ds, nrands=);
filename randsite url %unquote(%nrstr(%"http://www.random.org/integers/?num=)
&nrands%nrstr(&min=0&max=1000000000&col=1&base=10&format=plain&rnd=new%"));
proc import datafile=randsite out = &outds dbms = dlm replace;
getnames = no;
run;

data &outds;
set &outds;
var1 = var1 / 1000000000;
run;
%mend rands;

/* an example macro call */
%rands(nrands=25, outds=myrs);

The companion macro to find the quota is slightly simpler, since we don't need to insert the number of random numbers in the middle of the URL. Here, we show the quota in the SAS log; the file print syntax, shown in Example 8.34, can be used to send it to the output instead.

%macro quotacheck;
filename randsite url %unquote(%nrstr(%"http://www.random.org/quota/?format=plain%"));
proc import datafile=randsite out = __qc dbms = dlm replace;
getnames = no;
run;

data _null_;
set __qc;
put "Remaining quota is " var1 "bits";
run;
%mend quotacheck;

/* an example macro call */
%quotacheck;


R

Two R functions are shown below. The problem isn't as difficult as in SAS: the URL is assembled with paste() and wrapped in the as.character() function (section 1.4.1). Since paste() already returns a character string, the wrapper acts as a safeguard rather than a strict requirement.

truerand = function(numrand) {
read.table(as.character(paste("http://www.random.org/integers/?num=",
numrand, "&min=0&max=1000000000&col=1&base=10&format=plain&rnd=new",
sep="")))/1000000000
}

quotacheck = function() {
line = as.numeric(readLines("http://www.random.org/quota/?format=plain"))
return(line)
}
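
The two steps inside truerand(), assembling the URL and rescaling the integers, can be checked without touching the network or the quota. The sketch below is illustrative only: build_url() and parse_plain() are hypothetical helper names, and the sample response text is made up to mimic the one-integer-per-line plain format requested above.

```r
# Hypothetical helper: assemble the integer-generator URL for n values.
build_url = function(n) {
  paste("http://www.random.org/integers/?num=", n,
        "&min=0&max=1000000000&col=1&base=10&format=plain&rnd=new",
        sep = "")
}

# Hypothetical helper: turn a plain-text response (one integer per line)
# into a numeric vector on [0, 1].
parse_plain = function(txt) {
  read.table(text = txt)$V1 / 1000000000
}

url = build_url(25)
is.character(url)    # paste() already returns a character string

# Made-up response text standing in for what the site would return.
fake_response = "250000000\n500000000\n1000000000"
u = parse_plain(fake_response)

# One possible use: assign treatment arms from the uniforms.
arm = ifelse(u < 0.5, "treatment", "control")
```

Unlike truerand(), which returns a one-column data frame, parse_plain() extracts the column so the result is a plain numeric vector, which is usually easier to pass on to other functions.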

8 comments:

Dirk Eddelbuettel said...

In fact, random.org is so useful that I wrote an entire (albeit small) package (with two nice vignettes) which has been on CRAN for years under the 'random' name. Take a look, suggestions and patches welcome!

Ken Kleinman said...

Thanks for the pointer, Dirk! And thanks for all of your contributions to R.

fmark said...

It's worth noting that on OS X and Linux you can get true random numbers simply by reading /dev/random (as opposed to pseudo-random numbers at /dev/urandom).

Ken Kleinman said...

That's really interesting. There's some question about whether those are truly random or not. My intuition is that I doubt it can be truly random, since it's ultimately based on an algorithm, but it might be close enough for most uses.

Both /dev/random and /dev/urandom are referred to as "pseudo-random number generators" in various places I looked. This paper http://www.pinkas.net/PAPERS/gpr06.pdf (referred to in the wikipedia entry on /dev/random) appears to be the best description.

David said...

I don't think I understand why pseudo-random isn't good enough for the purpose in mind, or what purposes it wouldn't be good enough for.

Ken Kleinman said...

Hi David-- This page (http://www.random.org/randomness/) has a nice discussion comparing true and pseudo random numbers.

For most purposes, I think they're indistinguishable-- if you have a good pseudo-RNG. It's a question about whether you care whether the numbers are actually deterministic and periodic, or truly random. But perhaps someone else has a better understanding?

David said...

Interesting comments there on bad pseudo-random number generators.

Richard Thornton said...

Interesting treatise on random number generators in "Numerical Recipes".