In SAS, we ﬁnd the probability that X=x using differences in the CDF calculated via the

`cdf`function (1.10.1). We loop through the various binomial probabilities and values of x using the

`do ... end`structure (section 1.11.1). Finally, we store the probabilities in implicitly named variables via an

`array`statement (section 1.11.5).

**SAS**

data table (drop=j);

array probs [11] prob0 prob1 - prob10;

do p = .81 to .99 by .03;

do j = 0 to 10;

if j eq 1 then probs[j+1] = cdf("BINOMIAL", 0, p, 10);

else probs[j+1] = cdf("BINOMIAL", j, p, 10)

- cdf("BINOMIAL", j-1, p, 10);

end;

output;

end;

run;

The results are printed with two decimal places using the

`format`statement (section 1.2.4). The

`options`statement (section A.4) uses the

`ls`option to narrow the output width to be compatible with Blogger.

options ls=64;

proc print data=table noobs;

var p prob0 prob1 - prob10;

format _numeric_ 3.2;

run;

And the results are:

p

p p p p p p p p p p r

r r r r r r r r r r o

o o o o o o o o o o b

b b b b b b b b b b 1

p 0 1 2 3 4 5 6 7 8 9 0

.81 .00 .00 .00 .00 .00 .02 .08 .19 .30 .29 .12

.84 .00 .00 .00 .00 .00 .01 .05 .15 .29 .33 .17

.87 .00 .00 .00 .00 .00 .00 .03 .10 .25 .37 .25

.90 .00 .00 .00 .00 .00 .00 .01 .06 .19 .39 .35

.93 .00 .00 .00 .00 .00 .00 .00 .02 .12 .36 .48

.96 .00 .00 .00 .00 .00 .00 .00 .01 .05 .28 .66

.99 .00 .00 .00 .00 .00 .00 .00 .00 .00 .09 .90

**R**

In R we start by making a vector of the binomial probabilities, using the

`:`operator (section 1.11.3) to generate a sequence of integers. After creating a matrix (section B.4.4) to hold the table results, we loop (section 1.11.1) through the binomial probabilities, calling the

`dbinom()`function (section 1.1) to ﬁnd the probability that X=x. This calculation is nested within the

`round()`function (section 1.2.5) to reduce the digits displayed. Finally, we glue the vector of binomial probabilities to the results.

p <- .78 + (3 * 1:7)/100

allprobs <- matrix(nrow=length(p), ncol=11)

for (i in 1:length(p)) {

allprobs[i,] <- round(dbinom(0:10, 10, p[i]),2)

}

table <- cbind(p, allprobs)

table

With the result:

p

[1,] 0.81 0 0 0 0 0 0.02 0.08 0.19 0.30 0.29 0.12

[2,] 0.84 0 0 0 0 0 0.01 0.05 0.15 0.29 0.33 0.17

[3,] 0.87 0 0 0 0 0 0.00 0.03 0.10 0.25 0.37 0.25

[4,] 0.90 0 0 0 0 0 0.00 0.01 0.06 0.19 0.39 0.35

[5,] 0.93 0 0 0 0 0 0.00 0.00 0.02 0.12 0.36 0.48

[6,] 0.96 0 0 0 0 0 0.00 0.00 0.01 0.05 0.28 0.66

[7,] 0.99 0 0 0 0 0 0.00 0.00 0.00 0.00 0.09 0.90

As with the example of jittered scatterplots for dichotomous y by continuous x, (Example 7.3, Example 7.4, and Example 7.5) we will revisit this example for improvement in later entries.

## 2 comments:

In R one would write:

probs <- seq(.81, .99, .03)

round(t(sapply(probs, dbinom, x=0:10, size=10)), 2)

Very nice, Andi. Very R.

What Andi has done is to feed dbinom() as the function for which sapply() will repeat over the values of the probs vector (along with a more elegant way to generate the probs. He's also taking advantage of the ability of sapply() to accept additional arguments to the function sapply() is working on, added after the name of the function.

My philosophy is that the important thing is to get the job done, and that style is less important than results. That said, the R code I wrote in the example is pretty ugly and doesn't take full advantage of the tools R offers. I hope if I were to write it today I might come up with something closer to what Andi wrote.

Post a Comment