As an example, we consider the HELP data used extensively for examples in our book. Does homelessness affect physical health, as measured by the `PCS` score from the SF-36?

First, we consider modeling this relationship directly. This analysis only answers the question of whether homelessness is associated with poorer physical health.

Then we create a propensity score by estimating a logistic regression to predict homelessness using age, gender, number of drinks, and mental health composite score. Finally, we include the propensity score in the model predicting `PCS` from homelessness. If we accept that these propensity predictors fully account for the probability of homelessness, and there is an association between homelessness and `PCS` in the model adjusting for propensity, and the directionality of the association flows from homelessness to `PCS`, then we can conclude that homelessness causes differences in `PCS`.

We note here that this conclusion relies on other untestable assumptions, including linearity in the relationship between the propensity and `PCS`. Many users of propensity scores prefer to fit models within strata of the propensity score, or to match on propensity score, rather than use the regression adjustment we present in this entry. In a future entry we'll demonstrate the use of matching.
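As a hedged sketch of the stratification alternative, one common approach is to cut the propensity score into quintiles and adjust for the stratum as a factor rather than for the score itself. Simulated data stand in for the HELP variables here; the variable names (`pcs`, `homeless`) mimic those used in this entry but the numbers are made up.

```r
# Sketch of propensity-score stratification (simulated stand-in data).
set.seed(42)
n = 400
x = rnorm(n)                                  # a confounder
homeless = rbinom(n, 1, plogis(0.5 * x))      # exposure depends on x
pcs = 50 - 1 * homeless - 3 * x + rnorm(n, 0, 5)
propensity = fitted(glm(homeless ~ x, family = binomial))
# Cut the propensity score into quintiles
quintile = cut(propensity, quantile(propensity, seq(0, 1, 0.2)),
               include.lowest = TRUE)
# Adjust for stratum as a factor in the outcome model
summary(lm(pcs ~ homeless + quintile))
```

One appeal of stratifying is that it does not assume a linear relationship between the propensity and the outcome, only that the propensity is roughly constant within each stratum.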

In a departure from our usual practice, we show only pieces of the output below.

**SAS**

We begin by reading in the data and fitting the model. This is effectively a t-test (section 2.4.1), but we use `proc glm` to more easily compare with the adjusted results.

```sas
proc glm data="c:\book\help";
  model pcs = homeless/solution;
run;
```

```
                            Standard
Parameter       Estimate       Error  t Value  Pr > |t|
Intercept    49.00082904  0.68801845    71.22    <.0001
HOMELESS     -2.06404896  1.01292210    -2.04    0.0422
```

It would appear that homeless patients are in worse health than the others.

We next use `proc logistic` to estimate the propensity to homelessness, using the `output` statement to save the predicted probabilities. We omit the output here; it could be excluded in practice using the `ODS exclude all` statement.

```sas
proc logistic data="c:\book\help" desc;
  model homeless = age female i1 mcs;
  output out=propen pred=propensity;
run;
```

It's important to make sure that there is a reasonable amount of overlap in the propensity scores between the two groups. Otherwise, we'd be extrapolating outside the range of the data when we adjust.

```sas
proc means data=propen;
  class homeless;
  var propensity;
run;
```

```
                   N
HOMELESS         Obs       Mean      Minimum      Maximum
---------------------------------------------------------
       0         244  0.4296704    0.2136791    0.7876000
       1         209  0.4983750    0.2635031    0.9642827
---------------------------------------------------------
```

The mean propensity to homelessness is larger in the homeless group. If this were not the case, we might be concerned that the logistic model is too poor a predictor of homelessness to generate an effective propensity score. However, the maximum propensity among the homeless is 20% larger than the largest propensity in the non-homeless group. This suggests that a further review of the propensities would be wise. To check them, we'll generate histograms for each group using `proc univariate` (section 5.1.4).

```sas
proc univariate data=propen;
  class homeless;
  var propensity;
  histogram propensity;
run;
```

The resulting histograms suggest some risk of extrapolation. In our model, we'll remove subjects with propensities greater than 0.8.

```sas
proc glm data=propen;
  where propensity lt .8;
  model pcs = homeless propensity/solution;
run;
```

```
                             Standard
Parameter        Estimate       Error  t Value  Pr > |t|
Intercept     54.19944991  1.98264608    27.34    <.0001
HOMELESS      -1.19612942  1.03892589    -1.15    0.2502
propensity   -12.09909082  4.33385405    -2.79    0.0055
```

After the adjustment, we see a much smaller difference in the physical health of homeless and non-homeless subjects, and find no significant evidence of an association.

**R**

```r
ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
attach(ds)
summary(lm(pcs ~ homeless))
```

```
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   49.001      0.688  71.220   <2e-16 ***
homeless      -2.064      1.013  -2.038   0.0422 *
```

We use the `glm()` function to fit the logistic regression model (section 4.1.1). A `formula` object is used to specify the model. The predicted probabilities that we'll use as propensity scores are in the `fitted` element of the output object.

```r
form = formula(homeless ~ age + female + i1 + mcs)
glm1 = glm(form, family=binomial)
X = glm1$fitted
```

As in the SAS development, we check the resulting values. Here we use the `fivenum()` function (section 2.1.4) with the `tapply()` function (section 2.1.2) to get the results for each level of homelessness.

```
> tapply(X, homeless, FUN=fivenum)
$`0`
      398        97       378        69       438
0.2136787 0.3464170 0.4040223 0.4984242 0.7876015

$`1`
       16        18       262       293       286
0.2635026 0.4002759 0.4739152 0.5768015 0.9642833
```

Finding the same troubling evidence of non-overlap, we draw a histogram for each group. We do this manually, setting up two output areas with the `par()` function (section 5.3.6) and conditioning on each homeless value in two calls to the `hist()` function (section 5.1.4).

```r
par(mfrow=c(2,1))
hist(X[homeless==0], xlim=c(0.2,1))
hist(X[homeless==1], xlim=c(0.2,1))
```

As noted above, we'll exclude subjects with propensity greater than 0.8. This is done with the `subset` option to the `lm()` function (as in section 3.7.4).

```r
summary(lm(pcs ~ homeless + X, subset=(X < .8)))
```

```
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   54.199      1.983  27.337  < 2e-16 ***
homeless      -1.196      1.039  -1.151  0.25023
X            -12.099      4.334  -2.792  0.00547 **
```

## 8 comments:

Keep 'em commin'!

Really interesting example. Would it be fair to say that a similar result could be reached by directly including age, gender, number of drinks, and mental health in the `pcs` model?

While the propensity approach makes it possible to quickly remove "atypical cases" between the two groups, are there other reasons NOT to use the more direct approach I suggested?

Hello, Olivier, and thanks for reading. I'm not a causal inference specialist, but I can think of two reasons offhand why the propensity approach might be preferred over simple regression adjustment. First, with more covariates, you might use up a lot of degrees of freedom by doing the regression adjustment. Second, there might be predictors of treatment that would be inappropriate to include as covariates in the outcome model. I guess those could also be related to each other. And, while it doesn't answer your question, a positive reason to make propensity scores is to use the matching approach (as we'll do in next week's entry).
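To make Olivier's suggestion concrete, the direct adjustment would simply put the propensity-model covariates into the outcome model. A hedged sketch follows; simulated stand-in data are used here so the block is self-contained, and with the real data one would instead read the HELP csv as in the R section above (the coefficients will of course differ).

```r
# Direct covariate adjustment, for comparison with the propensity approach:
# include the propensity-model predictors themselves in the outcome model.
# Simulated stand-in data; variable names mimic the HELP data set.
set.seed(1)
n = 300
ds = data.frame(age = rnorm(n, 35, 8), female = rbinom(n, 1, 0.25),
                i1 = rpois(n, 15), mcs = rnorm(n, 32, 12))
ds$homeless = rbinom(n, 1, plogis(-1 + 0.02 * ds$age))
ds$pcs = 50 - 2 * ds$homeless - 0.1 * ds$age + rnorm(n, 0, 10)
fit = lm(pcs ~ homeless + age + female + i1 + mcs, data = ds)
summary(fit)
```

As the reply above notes, this spends more degrees of freedom than adjusting for the single propensity score, which matters more as the covariate list grows.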

Why didn't you use the R package "optmatch" for propensity score matching? It gives you the propensity score as well as several matching algorithms for PSM analyses.

Thanks for the pointer to "optmatch": more details can be found at http://www.stat.lsa.umich.edu/~bbh/optmatch.html or at CRAN. The package solves matching problems by translating them into minimum-cost flow problems, which are in turn solved optimally by the RELAX-IV codes of Bertsekas and Tseng.

Elizabeth Stuart's excellent paper entitled "Matching methods for causal inference: a review and a look forward" (Statistical Science, February 2010 25(1):1-21) includes a list of matching software for R (cem, Matching, MatchIt, optmatch, PSAgraphics, rbounds, twang) as well as SAS (gmatch, vmatch, Weighting, Greedy and Mahalanobis calipers).

Liz has a website with this info (and more) at:

http://www.biostat.jhsph.edu/~estuart/propensityscoresoftware.html

How do we explain low propensity scores for the treatment group (e.g., 0.1 to 0.4)? Does it mean that the sample is well dispersed with regard to the covariates we used?

I compared treatments and controls in different countries, using different covariates; in some countries the predicted propensity scores ranged between 0.1 and 0.4, while in others they ranged between 0.1 and 0.7. How do we explain this?
