**SAS**

We'll create the data as a summary, rather than for every line of data. Then we can use the "events/trials" syntax (section 4.1.1) that both

`proc logistic`and

`proc genmod`accept. This is another way to reduce the size of data sets (along with the

`weight`option mentioned previously) but is less generally useful. The the

`exact`statement in

`proc logistic`will fit the exact logistic regression and generate a p-value. The

`estimate`option is required to display estimated log odds ratio.

data exact;

x=0; count=0; n=100; output;

x=1; count=5; n=100; output;

run;

proc logistic data=exact;

model count/n = x;

exact x / estimate;

run;

This generates the following output:

Exact Parameter Estimates

Standard 95% Confidence

Parameter Estimate Error Limits p-Value

x 1.9414* . -0.0677 Infinity 0.0594

NOTE: * indicates a median unbiased estimate.

**R**

In R we use the

`elrm()`function in the

`elrm`package to approximate exact logistic regression, as described in this paper by the package's authors. The function requires a special formula object with syntax identical to the SAS events/trials syntax. (Note that the function does not behave as expected when identical observations with trials=1 are submitted. Thus data should be collapsed into unique combinations of predictors before using the function.) In addition, it requires its data to be included in a data frame. We'll construct the data frame in one function call to

`data.frame()`.

elrmdata = data.frame(count=c(0,5), x=c(0,1), n=c(100,100))

library(elrm)

resexact = elrm(count/n ~ x, interest = ~x, iter=22000,

burnIn=2000, data=elrmdata, r=2)

summary(resexact)

producing the following result:

Call:

[[1]]

elrm(formula = count/n ~ x, interest = ~x, r = 2, iter = 22000,

dataset = elrmdata, burnIn = 2000)

Results:

estimate p-value p-value_se mc_size

x 2.0225 0.02635 0.0011 20000

95% Confidence Intervals for Parameters

lower upper

x -0.02065572 Inf

Differences between the SAS and R results most likely arise from the fact that the

`elrm()`function is an approximation of the exact approach. The upper limit of infinity seen in the exact SAS analysis and approximate exact

`elrm()`analysis reveals a limitation of this approach relative to the Firth approach seen in example 8.15 and the Bayesian approach we'll examine later.

A final note: if the true Pr(Y=1|X=1) = 0.05, then the true Pr(Y=1|X=0) that results in a log odds ratio of 1.94 is about 0.0075; for a log odds ratio of 2.02, the true probability is about 0.0069.

## 3 comments:

Thank you very much for pointing out that data need to be collapsed for elrm to run. That solved the problem I was having running my model. You made my day : )

You're welcome! It took a bunch of messing around to figure it out myself, and I'm glad you were able to benefit from it.

Very good blog. But I want to ask where is the article that you mentionned at the begining. Than you.

Post a Comment