Monday, May 10, 2010

Example 7.36: Propensity score stratification

In examples 7.34 and 7.35 we described methods using propensity scores to account for possible confounding factors in an observational study.

In addition to adjusting for the propensity score in a multiple regression and matching on the propensity score, researchers will often stratify by the propensity score, and carry out analyses within each group defined by these scores. With sufficient sample size, we might use 5 or 10 strata, but in this example, we'll use 4.


To facilitate running the code from this entry, we include all of the previous code snippets (the code is also posted on the book website as propensity.R.

ds = read.csv("")
summary(lm(pcs ~ homeless))

form = formula(homeless ~ age + female + i1 + mcs)
glm1 = glm(form, family=binomial)
X = glm1$fitted
tapply(X, homeless, FUN=fivenum)
hist(X[homeless==0], xlim=c(0.2,1))
hist(X[homeless==1], xlim=c(0.2,1))
summary(lm(pcs ~ homeless + X, subset=(X<.8)))
rr = Match(Y=pcs, Tr=homeless, X=X, M=1)
MatchBalance(form, match.out=rr, nboots=10)

We will use the propensity scores (stored in the variable X) as a way of stratifying the dataset into four approximately equally sized groups. Unlike our earlier approach, we will include all subjects (even though with very high propensities) and note that as a result we will observe animbalance in the distribution of homelessness by strata.

breakvals = fivenum(X) # section 2.1.4
strat = cut(X, breaks=breakvals,
labels=c('bot quart', '2nd quart', '3rd quart', 'top quart'),

Next we set some parameters needed to tweak the graphic that will be created.

topval = 82
botval = 15
eps = 3

Next we create a set of boxplots for the PCS scores of the homeless and non-homeless in each of the four strata plus overall. In addition, we annotate the plot with the labels as well as the sample size in each group.

boxplot(pcs[homeless==0 & strat=='bot quart'],
pcs[homeless==1 & strat=='bot quart'],
pcs[homeless==0 & strat=='2nd quart'],
pcs[homeless==1 & strat=='2nd quart'],
pcs[homeless==0 & strat=='3rd quart'],
pcs[homeless==1 & strat=='3rd quart'],
pcs[homeless==0 & strat=='top quart'],
pcs[homeless==1 & strat=='top quart'],
pcs[homeless==0], pcs[homeless==1],
ylim=c(botval,topval), xaxt="n", ylab="PCS score")
abline(v=8.5, lwd=2)
text(1, topval, "not\nhomeless", cex=cexval)
text(2, topval, "homeless", cex=cexval)
text(3, topval, "not\nhomeless", cex=cexval)
text(4, topval, "homeless", cex=cexval)
text(5, topval, "not\nhomeless", cex=cexval)
text(6, topval, "homeless", cex=cexval)
text(7, topval, "not\nhomeless", cex=cexval)
text(8, topval, "homeless", cex=cexval)
text(9, topval, "not\nhomeless", cex=cexval)
text(10, topval, "homeless", cex=cexval)

text(1.5, botval+eps, "bot quart")
text(3.5, botval+eps, "2nd quart")
text(5.5, botval+eps, "3rd quart")
text(7.5, botval+eps, "top quart")
text(9.5, botval+eps, "overall")

text(1, topval-eps, paste("n=",sum(homeless==0 & strat=='bot quart', na.rm=TRUE),sep=""), cex=cexval)
text(2, topval-eps, paste("n=",sum(homeless==1 & strat=='bot quart', na.rm=TRUE),sep=""), cex=cexval)
text(3, topval-eps, paste("n=",sum(homeless==0 & strat=='2nd quart', na.rm=TRUE),sep=""), cex=cexval)
text(4, topval-eps, paste("n=",sum(homeless==1 & strat=='2nd quart', na.rm=TRUE),sep=""), cex=cexval)
text(5, topval-eps, paste("n=",sum(homeless==0 & strat=='3rd quart', na.rm=TRUE),sep=""), cex=cexval)
text(6, topval-eps, paste("n=",sum(homeless==1 & strat=='3rd quart', na.rm=TRUE),sep=""), cex=cexval)
text(7, topval-eps, paste("n=",sum(homeless==0 & strat=='top quart', na.rm=TRUE),sep=""), cex=cexval)
text(8, topval-eps, paste("n=",sum(homeless==1 & strat=='top quart', na.rm=TRUE),sep=""), cex=cexval)
text(9, topval-eps, paste("n=",sum(homeless==0, na.rm=TRUE),sep=""), cex=cexval)
text(10, topval-eps, paste("n=",sum(homeless==1, na.rm=TRUE),sep=""), cex=cexval)

We note that the difference between the median PCS scores is smaller within each of the quartiles of the propensity scores than the difference between medians overall. In addition, there is some indication that the difference increases as a function of the propensity score. Finally, note that the proportion of subjects in the two groups veries wildly by strata, swinging from 1/3 to 2/3 being homeless.

We can fit a t-test within each stratum, to assess whether there are statistically significant differences between the PCS scores for the homeless and non-homeless subjects. Here we use the by() operator (which takes three arguments: a dataset, a variable to use to specify groups, and a function to be called with the subset defined for that level) combined with the with() function to carry out the test for the appropriate subset, returning the p-value each time. This requires creation of a dataframe with the appropriate quantities.

> stratdf = data.frame(pcs, homeless, strat)
> out = by(stratdf, strat,
function(mydataframe) {with(mydataframe,
t.test(pcs[homeless==0], pcs[homeless==1]))
> out
strat: bot quart

Welch Two Sample t-test

data: pcs[homeless == 0] and pcs[homeless == 1]
t = 0.806, df = 58.564, p-value = 0.4235
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-2.65544 6.23678
sample estimates:
mean of x mean of y
49.88714 48.09647

strat: 2nd quart

Welch Two Sample t-test

data: pcs[homeless == 0] and pcs[homeless == 1]
t = -0.1011, df = 101.083, p-value = 0.9197
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4.212064 3.803686
sample estimates:
mean of x mean of y
49.37089 49.57508

strat: 3rd quart

Welch Two Sample t-test

data: pcs[homeless == 0] and pcs[homeless == 1]
t = 0.823, df = 110.918, p-value = 0.4123
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-2.357005 5.705745
sample estimates:
mean of x mean of y
49.44396 47.76959

strat: top quart

Welch Two Sample t-test

data: pcs[homeless == 0] and pcs[homeless == 1]
t = 0.9222, df = 74.798, p-value = 0.3594
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-2.304285 6.276212
sample estimates:
mean of x mean of y
46.07584 44.08988

The model results show that while the mean PCS is slightly lower among homeless subjects in 3 of the 4 propensity strata, the standard error of the difference is so large that there's little credible evidence for a health cost ascribable to homelessness.


We begin by getting the propensity scores.

proc logistic data="c:\book\help" desc;
model homeless = age female i1 mcs;
output out=propen pred=propensity;

Next, we find the quartiles of the propensity scores using proc means (section 2.1.1) saving the results in a new data set. The new data set has one observation with the three quartiles.

proc means q1 median q3 data=propen;
var propensity;
output out = propenquart q1=q1 median=median q3=q3;

We want to generate a categorical variable to indicate quartile membership. This requires having the quartile values from the propenquart data set available for each row of the data set with the propensities. To do this, we'll use the point option to the set statement. See this note for more details on this approach.

data propen2;
set propen;
setme = 1;
set propenquart point=setme;
if propensity lt q1 then propensity_quartile = "1st quart";
else if propensity lt median then propensity_quartile = "2nd quart";
else if propensity lt q3 then propensity_quartile = "3rd quart";
else if propensity lt 0.8 then propensity_quartile = "4th quart";
else delete;
if homeless = 1 then homeless_status = "Yes";
else if homeless = 0 then homeless_status = "No";

Next, we'll use proc boxplot (section 5.1.7) to make a plot similar to that made in R. The SAS plot does not show the boxes ignoring strata-- these would be quite a hassle, most likely. The boxplot procedure is finicky, and seems to work best for parallel boxplots when the group variables are character variables, and sorting is often necessary.

proc sort data=propen2; by propensity_quartile homeless_status; run;

proc boxplot data=propen2;
plot pcs * homeless_status (propensity_quartile);
insetgroup n / pos=bottom;

Conclusions are noted above.

Finally, we perform the analysis comparing PCS in the homeless and non-homeless groups within each strata, using the ODS system to reduce output.

ods select none;
ods output parameterestimates=propstrata;
proc glm data=propen2;
by propensity_quartile;
model pcs=homeless;
ods select all;

Here are the final results. Small differences from the R results are due to the fact that the defaults for specification of quartiles differ between the two packages (note that for R, there are 79 subjects in the non-homeless bottom quartile, while SAS included 78 non-homeless subjects in that group).

proc print data=propstrata;
where parameter="HOMELESS";

Obs quartile Estimate StdErr Probt

2 1st quart -2.00721337 2.10622538 0.3427
4 2nd quart 0.44727354 2.11652411 0.8330
6 3rd quart -1.55746908 2.03369752 0.4454
8 4th quart -2.00910422 2.06426071 0.3325


Anonymous said...

Would it be possible to have a significant p-value in only 1 out of the 4 quartiles? If so, what would that mean?

Anonymous said...

Definitely possible. Before interpreting, I would do a joint test across strata to see whether the effect really differs between strata. If so, then I would say that, for example, there appears to be an effect, but only among those people most likely to have the exposure. IOW, if you're really likely to have the exposure, having the exposure is associated with the outcome. It's a form of interaction.

Anonymous said...

Thanks for the explanation! Would you have any examples in SAS or R to perform the joint test? It seems another issue with this test is choosing the right covariates...?

Ken Kleinman said...

Just fit an interaction between strata and exposure-- the joint test against the null of equal effects in all strata would be from /type3 in SAS or aov() in R.

Anonymous said...

Thanks for explanation!

It is possible to estimate separate regression models for each stratum?
to improve the precision of estimates? Anyone have some references?

Anonymous said...

Is it possible to use both propensity score information and post-treatment predictors in an outcome model? I haven't seen an example of how to do this in SAS or R.

In my data I have, among other variables:

migrated (binary outcome- yes or no)
military service (binary treatment- yes or no)
age (predictive of outcome and treatment- continuous)
nativity (predictive of outcome and treatment- categorical)
postwar_occupation (occurs after treatment- categorical)

Can anyone provide some guidance on how to build the required models to best account for selection into the treatment group and for the effect of postwar occupation?

Thank you.