Showing posts with label regression adjustment. Show all posts

Monday, May 3, 2010

Example 7.35: Propensity score matching

As discussed in example 7.34, it's sometimes preferable to match on propensity scores, rather than adjust for them as a covariate.

SAS

We use a suite of macros written by Jon Kosanke and Erik Bergstralh at the Mayo Clinic. The dist macro calculates the pairwise distances between observations, while the vmatch macro makes matches based on the distances, finding the closest set of matches while minimizing the overall distance. The vmatch macro in turn uses the nobs macro. The original macros are available for download from the Mayo Clinic Division of Biomedical Statistics and Informatics. Slightly enhanced versions of the nobs and vmatch macros are available from the code section of the book website. The enhanced versions are used to generate the results below.

We previously created the propen data set containing the propensity score variable. In that entry we also noted a lack of overlap in the propensity distributions, and decided to drop observations with propensity > 0.8.

First we drop observations with propensity greater than 0.8 using the subsetting if statement (section 1.5.1). This leaves 201 homeless subjects. We then read in the macros using the %include statement (section 2.1.8). Then we run the dist macro, using its option to call the vmatch macro.


data prop2; set propen;
if propensity lt .8;
run;

%include "C:\ken\sasmacros\vmatch.sas";
%include "C:\ken\sasmacros\dist.sas";
%include "C:\ken\sasmacros\nobs.sas";

%dist(data=prop2, group=homeless, id=id, mvars=propensity,
wts=1, vmatch=Y, a=1, b=1, lilm=201, dmax=0.1,
outm=mp1_b, summatch=n, printm=N, mergeout=mpropen);


The macros are well documented by text included at the top of each macro, as is common with SAS macros that authors share. In the preceding code, the parameter values in the first line are relatively self-explanatory. The wts parameter allows multiple matching variables to be weighted differently; here, with a single matching variable, it is set to 1. The dmax parameter specifies the maximum distance acceptable for a match. We arbitrarily decide that the propensities must be within 0.1 to make a match (our results would differ if other criteria were specified for the matching, as we will see below in the R comparison). The remaining parameters request the matching, ask for one and only one match per case, ask for all the cases to be matched (if possible), suppress printed output, and name the data set to contain output.
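To make the role of a maximum matching distance concrete, here is a minimal sketch in R of 1:1 matching on a propensity score. It is not the vmatch algorithm: vmatch minimizes the overall distance, while this sketch uses a simpler greedy nearest-neighbor rule. The data are simulated, and the names ps, treated, and caliper are our own illustrative choices rather than anything taken from the macros.

```r
# Greedy 1:1 nearest-neighbor matching on a propensity score, with a caliper.
# Simulated data; `ps`, `treated`, and `caliper` are illustrative names only.
set.seed(735)
n <- 200
treated <- rbinom(n, 1, 0.4)
ps <- plogis(rnorm(n, mean = ifelse(treated == 1, 0.3, -0.3)))  # stand-in scores

caliper <- 0.1                        # plays the role of a maximum distance
controls <- which(treated == 0)
used <- rep(FALSE, length(controls))  # each control can be matched only once
match_id <- rep(NA_integer_, n)       # row index of the matched control, if any

for (i in which(treated == 1)) {
  d <- abs(ps[controls] - ps[i])      # distance from this case to every control
  d[used] <- Inf                      # controls already taken are unavailable
  j <- which.min(d)
  if (d[j] <= caliper) {              # accept only matches within the caliper
    match_id[i] <- controls[j]
    used[j] <- TRUE
  }
}

pairs <- data.frame(case = which(!is.na(match_id)),
                    control = match_id[!is.na(match_id)])
pairs$dist <- abs(ps[pairs$case] - ps[pairs$control])
nrow(pairs)       # number of matched pairs
max(pairs$dist)   # largest within-pair distance
```

Because the greedy rule processes cases in order, a different ordering can give different pairs; an optimal matcher avoids that dependence at the cost of more computation.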

The output data set mpropen is identical to the input data set, with the addition of a new indicator variable, matched. We can compare the distribution of the covariates before and after the matching with the means procedure (section 2.1.1).


title "observed data";
proc means data=propen mean;
class homeless;
var age female i1 mcs;
run;

title "matched observations";
proc means data=mpropen mean;
where matched;
class homeless;
var age female i1 mcs;
run;

observed data
                 N
HOMELESS       Obs    Variable          Mean
-----------------------------------------------
       0       244    AGE         35.0409836
                      FEMALE       0.2745902
                      I1          13.5122951
                      MCS         32.4868303

       1       209    AGE         36.3684211
                      FEMALE       0.1913876
                      I1          23.0382775
                      MCS         30.7308549
-----------------------------------------------

matched observations
                 N
HOMELESS       Obs    Variable          Mean
-----------------------------------------------
       0       201    AGE         35.6218905
                      FEMALE       0.1791045
                      I1          15.9154229
                      MCS         31.4815123

       1       201    AGE         36.1492537
                      FEMALE       0.1990050
                      I1          19.9452736
                      MCS         30.9176772
-----------------------------------------------


We see that the covariates are much better balanced after matching. Also note the dramatic impact of removing the 8 (4%) homeless cases with very large propensities. This has reduced the mean number of drinks in the homeless group from 23.0 to 19.9, or about 13%!

We can now perform the analysis on the matched cases. The two classes of homeless status now have nearly equal distributions of the probability of homelessness.


proc glm data=mpropen;
where matched;
model pcs = homeless / solution;
run;

                               Standard
Parameter        Estimate         Error    t Value    Pr > |t|

Intercept     48.95273471    0.76199823      64.24      <.0001
HOMELESS      -1.79386398    1.07762823      -1.66      0.0968


After matching, the effect of homelessness on physical health is attenuated and has a larger p-value.


R

In R, the Matching package provides tools for matching and analysis. The Match() function implements a variety of algorithms for multivariate matching including propensity score, Mahalanobis and inverse variance matching. The function is intended to be used in conjunction with the MatchBalance() function, which determines the extent to which covariate balance has been achieved. Match() takes the propensity score as an argument, as well as the outcome to be compared and the group indicators.

A wide variety of matching options include matching with or without replacement, bias adjustment, different methods for handling ties, exact and caliper matching. The GenMatch function can be used to automatically find balance via a genetic search algorithm which determines the optimal weight to give each covariate.

An extensive website describes the package and the wide variety of options that it supports, and a related paper is forthcoming. Three extended examples are included to help illustrate the mechanics of matching.


library(Matching)
rr = Match(Y=pcs, Tr=homeless, X=X, M=1)


The function returns an object describing the matching.


names(rr)
[1] "est" "se" "est.noadj" "se.standard"
[5] "se.cond" "mdata" "index.treated" "index.control"
[9] "index.dropped" "weights" "orig.nobs" "orig.wnobs"
[13] "orig.treated.nobs" "nobs" "wnobs" "caliper"
[17] "ecaliper" "exact" "ndrops" "ndrops.matches"
[21] "MatchLoopC" "version" "estimand"




The results can be displayed by running summary().
      
summary(rr)

Estimate... -0.80207
AI SE...... 1.4448
T-stat..... -0.55516
p.val...... 0.57878

Original number of observations.............. 453
Original number of treated obs............... 209
Matched number of observations............... 209
Matched number of observations (unweighted). 252


By default, the observations are given equal weight. If all of the observations had a weight of 1 on input, then each matched-pair will have a weight of 1 on output if there are no ties.
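The gap between the weighted count (209) and the unweighted count (252) in the summary comes from ties. The toy sketch below shows the idea, under the assumption that a treated observation tied between several controls splits its unit weight evenly among them. The pairs are hypothetical, and this is an illustration of the bookkeeping, not the internal code of Match().

```r
# Toy illustration of how tied matches can share weight.
# Hypothetical matched pairs: treated unit 3 is tied between controls 12 and 15.
pairs <- data.frame(treated = c(1, 2, 3, 3),
                    control = c(10, 11, 12, 15))

# Each treated unit carries total weight 1, split evenly over its tied controls.
pairs$weight <- 1 / ave(pairs$treated, pairs$treated, FUN = length)

nrow(pairs)         # unweighted number of matched observations: 4
sum(pairs$weight)   # weighted number of matched observations: 3
```

With no ties, every row would have weight 1 and the two counts would agree.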

We see that the causal estimate of -0.80 in the matched comparison is not statistically significant (p=0.58), which is consistent with the other results that accounted for the confounders (though we note that the specific results depend on the particular options that are selected for the matching).

The MatchBalance() function can be used to describe the distribution of the predictors (by homeless status) before and after matching (to save space, only the results for age and i1 are displayed). This is helpful to determine if the matching resulted in similar marginal distributions.




> MatchBalance(form, match.out=rr, nboots=10)
***** (V1) age ***** Before Matching After Matching
mean treatment........ 36.368 36.368
mean control.......... 35.041 36.423
std mean diff......... 16.069 -0.65642

mean raw eQQ diff..... 1.5981 0.94841
med raw eQQ diff..... 1 1
max raw eQQ diff..... 7 10

mean eCDF diff........ 0.037112 0.022581
med eCDF diff........ 0.026365 0.019841
max eCDF diff........ 0.10477 0.083333

var ratio (Tr/Co)..... 1.3290 1.2671
T-test p-value........ 0.070785 0.93902
KS Bootstrap p-value.. < 2.22e-16 0.3
KS Naive p-value...... 0.16881 0.34573
KS Statistic.......... 0.10477 0.083333

***** (V3) i1 ***** Before Matching After Matching
mean treatment........ 23.038 23.038
mean control.......... 13.512 20.939
std mean diff......... 40.582 8.945

mean raw eQQ diff..... 9.6316 2.1071
med raw eQQ diff..... 8 1
max raw eQQ diff..... 73 66

mean eCDF diff........ 0.11853 0.018753
med eCDF diff........ 0.12377 0.011905
max eCDF diff........ 0.20662 0.087302

var ratio (Tr/Co)..... 2.3763 1.3729
T-test p-value........ 7.8894e-07 0.011786
KS Bootstrap p-value.. < 2.22e-16 0.3
KS Naive p-value...... 0.00013379 0.29213
KS Statistic.......... 0.20662 0.087302


More details regarding each of the tests for differences in means or distributions can be found using ?MatchBalance. The results for both of the variables presented above indicate that the distributions are considerably closer to each other in the matched sample than in the original dataset.
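For readers who want a feel for the "std mean diff" rows, here is a small sketch of a standardized mean difference expressed in percent. The data are simulated, the function name std_mean_diff is our own, and scaling by the treated-group standard deviation is an assumption about the convention; consult ?MatchBalance for the exact definition the package uses.

```r
# Standardized mean difference (in percent) for one covariate, simulated data.
set.seed(1)
group <- rep(c(1, 0), times = c(150, 200))
x <- rnorm(350, mean = ifelse(group == 1, 36.4, 35.0), sd = 8)

std_mean_diff <- function(x, group) {
  xt <- x[group == 1]                      # treated (exposed) values
  xc <- x[group == 0]                      # control values
  100 * (mean(xt) - mean(xc)) / sd(xt)     # treated-SD scaling is an assumption
}

std_mean_diff(x, group)
```

Values near zero indicate that the group means are close relative to the spread of the covariate, which is what we hope to see after matching.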

Monday, April 26, 2010

Example 7.34: Propensity scores and causal inference from observational studies

Propensity scores can be used to help make causal interpretation of observational data more plausible, by adjusting for other factors that may be responsible for differences between groups. Heuristically, we estimate the probability of exposure, rather than randomize exposure, as we'd ideally prefer to do. The estimated probability of exposure is the propensity score. If our estimation of the propensity score incorporates the reasons why people self-select to exposure status, then two individuals with equal propensity score are equally likely to be exposed, and we can interpret them as being randomly assigned to exposure. This process is not unlike ordinary regression adjustment for potential confounders, but uses fewer degrees of freedom and can incorporate more variables.

As an example, we consider the HELP data used extensively for examples in our book. Does homelessness affect physical health, as measured by the PCS score from the SF-36?

First, we consider modeling this relationship directly. This analysis only answers the question of whether homelessness is associated with poorer physical health.

Then we create a propensity score by estimating a logistic regression to predict homelessness using age, gender, number of drinks, and mental health composite score. Finally, we include the propensity score in the model predicting PCS from homelessness. If we accept that these propensity predictors fully account for the probability of homelessness, and there is an association between homelessness and PCS in the model adjusting for propensity, and the directionality of the association flows from homelessness to PCS, then we can conclude that homelessness causes differences in PCS.

We note here that this conclusion relies on other untestable assumptions, including linearity in the relationship between the propensity and PCS. Many users of propensity scores prefer to fit models within strata of the propensity score, or to match on propensity score, rather than use the regression adjustment we present in this entry. In a future entry we'll demonstrate the use of matching.
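The logic of regression adjustment for a propensity score can be sketched with simulated data in which we know the truth. In the sketch below the exposure has no effect on the outcome, but a confounder z drives both. All names (z, exposed, y, ps) and the simulation settings are our own assumptions, not the HELP data.

```r
# Simulated example: confounded exposure, propensity score used as a covariate.
set.seed(734)
n <- 2000
z <- rnorm(n)                                  # confounder
exposed <- rbinom(n, 1, plogis(0.8 * z))       # exposure depends on z
y <- 50 - 3 * z + rnorm(n, sd = 5)             # true effect of exposure is 0

# Step 1: estimate the propensity score with logistic regression.
ps <- fitted(glm(exposed ~ z, family = binomial))

# Step 2: compare the crude estimate with the propensity-adjusted estimate.
crude    <- coef(lm(y ~ exposed))["exposed"]
adjusted <- coef(lm(y ~ exposed + ps))["exposed"]
c(crude = unname(crude), adjusted = unname(adjusted))
```

The crude coefficient picks up the confounding, while the adjusted coefficient should be much closer to zero. It will generally not be exactly zero, in part because the outcome here is linear in z rather than in the propensity score, which is the linearity concern raised above.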

In a departure from our usual practice, we show only pieces of the output below.


SAS

We begin by reading in the data and fitting the model. This is effectively a t-test (section 2.4.1), but we use proc glm to more easily compare with the adjusted results.


proc glm data="c:\book\help";
model pcs = homeless/solution;
run;

                               Standard
Parameter        Estimate         Error    t Value    Pr > |t|

Intercept     49.00082904    0.68801845      71.22      <.0001
HOMELESS      -2.06404896    1.01292210      -2.04      0.0422


It would appear that homeless patients are in worse health than the others.

We next use proc logistic to estimate the propensity to homelessness, using the output statement to save the predicted probabilities. We omit the output here; it could be excluded in practice using the ODS exclude all statement.


proc logistic data="c:\book\help" desc;
model homeless = age female i1 mcs;
output out=propen pred=propensity;
run;


It's important to make sure that there is a reasonable amount of overlap in the propensity scores between the two groups. Otherwise, we'd be extrapolating outside the range of the data when we adjust.


proc means data=propen;
class homeless;
var propensity;
run;

                 N
HOMELESS       Obs            Mean         Minimum         Maximum
---------------------------------------------------------------
       0       244       0.4296704       0.2136791       0.7876000

       1       209       0.4983750       0.2635031       0.9642827
---------------------------------------------------------------


The mean propensity to homelessness is larger in the homeless group. If this were not the case, we might be concerned that the logistic model is too poor a predictor of homelessness to generate an effective propensity score. However, the maximum propensity among the homeless is more than 20% larger than the largest propensity in the non-homeless group. This suggests that a further review of the propensities would be wise. To check them, we'll generate histograms for each group using proc univariate (section 5.1.4).


proc univariate data=propen;
class homeless;
var propensity;
histogram propensity;
run;




The resulting histograms suggest some risk of extrapolation. In our model, we'll remove subjects with propensities greater than 0.8.


proc glm data=propen;
where propensity lt .8;
model pcs = homeless propensity/solution;
run;

                               Standard
Parameter        Estimate         Error    t Value    Pr > |t|

Intercept     54.19944991    1.98264608      27.34      <.0001
HOMELESS      -1.19612942    1.03892589      -1.15      0.2502
propensity   -12.09909082    4.33385405      -2.79      0.0055


After the adjustment, we see a much smaller difference in the physical health of homeless and non-homeless subjects, and find no significant evidence of an association.



R


ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
attach(ds)
summary(lm(pcs ~ homeless))

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   49.001      0.688  71.220   <2e-16 ***
homeless      -2.064      1.013  -2.038   0.0422 *


We use the glm() function to fit the logistic regression model (section 4.1.1). A formula object is used to specify the model. The predicted probabilities that we'll use as propensity scores are in the fitted element of the output object.


form = formula(homeless ~ age + female + i1 + mcs)
glm1 = glm(form, family=binomial)
X = glm1$fitted


As in the SAS development, we check the resulting values. Here we use the fivenum() function (section 2.1.4) with the tapply() function (section 2.1.2) to get the results for each level of homelessness.


> tapply(X,homeless, FUN=fivenum)
$`0`
398 97 378 69 438
0.2136787 0.3464170 0.4040223 0.4984242 0.7876015

$`1`
16 18 262 293 286
0.2635026 0.4002759 0.4739152 0.5768015 0.9642833


Finding the same troubling evidence of non-overlap, we fit a histogram for each group. We do this manually, setting up two output areas with the par() function (section 5.3.6) and conditioning to use data from each homeless value in two calls to the hist() function (section 5.1.4).


par(mfrow=c(2,1))
hist(X[homeless==0], xlim=c(0.2,1))
hist(X[homeless==1], xlim=c(0.2,1))
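As a numeric complement to the histograms, one could count how many subjects fall outside the range of propensities shared by both groups. The sketch below does this on simulated scores; the names ps, group, lower, and upper are our own, and using the overlapping range as the definition of "common support" is one simple choice among several.

```r
# Counting subjects outside the shared range of propensity scores, simulated data.
set.seed(2)
group <- rbinom(300, 1, 0.45)
ps <- plogis(rnorm(300, mean = ifelse(group == 1, 0.4, -0.4)))

rng <- lapply(split(ps, group), range)       # min and max within each group
lower <- max(rng[["0"]][1], rng[["1"]][1])   # larger of the two minima
upper <- min(rng[["0"]][2], rng[["1"]][2])   # smaller of the two maxima

outside <- ps < lower | ps > upper           # no counterpart in the other group
table(group, outside)
```

A fixed cutoff, such as the 0.8 used in this entry, is a blunter version of the same idea.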




As noted above, we'll exclude subjects with propensity greater than 0.8. This is done with the subset option to the lm() function (as in section 3.7.4).


summary(lm(pcs ~ homeless + X,subset=(X < .8)))

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   54.199      1.983  27.337  < 2e-16 ***
homeless      -1.196      1.039  -1.151  0.25023
X            -12.099      4.334  -2.792  0.00547 **