It's straightforward to calculate these quantities in SAS and R. We'll demonstrate with data from the HELP study, modeling PCS as a function of MCS and homelessness among female subjects.
SAS
In SAS, standardized coefficients are available as the stb option for the model statement in proc reg.
proc reg data="c:\book\help";
where female eq 1;
model pcs = mcs homeless / stb;
run;
The REG Procedure
Model: MODEL1
Dependent Variable: PCS
Parameter Estimates

                Parameter    Standard
Variable   DF    Estimate       Error   t Value   Pr > |t|
Intercept   1    39.62619     2.49830     15.86     <.0001
MCS         1     0.21945     0.07644      2.87     0.0050
HOMELESS    1    -2.56907     1.95079     -1.32     0.1908

Parameter Estimates

                Standardized
Variable   DF       Estimate
Intercept   1              0
MCS         1        0.26919
HOMELESS    1       -0.12348
R
In R we demonstrate the use of the lm.beta() function in the QuantPsyc package (due to Thomas D. Fletcher of State Farm). The function is short and sweet, and takes a linear model object as argument:
> lm.beta
function (MOD)
{
    b <- summary(MOD)$coef[-1, 1]
    sx <- sd(MOD$model[-1])
    sy <- sd(MOD$model[1])
    beta <- b * sx/sy
    return(beta)
}
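In R 3.0.0 and later, sd() no longer accepts a data frame, so the function body as printed above fails if pasted into a current session. A minimal equivalent sketch follows. The name std_beta is ours (hypothetical, not part of QuantPsyc), and the built-in mtcars data stands in for the HELP data so that the snippet runs offline.

```r
# Hypothetical helper (not part of QuantPsyc): standardized slopes from an lm fit.
# Assumes every predictor is a numeric column (no factors or interactions).
std_beta <- function(mod) {
  b  <- coef(mod)[-1]                 # slopes, intercept dropped
  sx <- sapply(mod$model[-1], sd)     # SD of each predictor column
  sy <- sd(mod$model[[1]])            # SD of the outcome
  b * sx / sy
}

fit <- lm(mpg ~ wt + hp, data = mtcars)   # mtcars is a stand-in data set
std_beta(fit)
```

The same numbers should result from refitting the model after standardizing every variable, which is a handy check on any implementation.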
Here we apply the function to data from the HELP study.
ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
female = subset(ds, female==1)
lm1 = lm(pcs ~ mcs + homeless, data=female)
The results, in terms of the unstandardized regression parameters, are the same as in SAS:
> summary(lm1)
Call:
lm(formula = pcs ~ mcs + homeless, data = female)
Residuals:
    Min      1Q  Median      3Q     Max
-28.163  -5.821  -1.017   6.775  29.979

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 39.62619    2.49830  15.861  < 2e-16 ***
mcs          0.21945    0.07644   2.871  0.00496 **
homeless    -2.56907    1.95079  -1.317  0.19075
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 9.761 on 104 degrees of freedom
Multiple R-squared: 0.0862, Adjusted R-squared: 0.06862
F-statistic: 4.905 on 2 and 104 DF, p-value: 0.009212
To generate the standardized parameter estimates, we use the lm.beta() function.
library(QuantPsyc)
lm.beta(lm1)
This generates the following output:
       mcs   homeless
 0.2691888 -0.1234776
A 1 standard deviation change in MCS has more than twice the impact on PCS as a 1 standard deviation change in the HOMELESS variable (0.27 versus 0.12 in absolute value). This example highlights another potential weakness of standardized regression coefficients, however: the homeless variable can take only the values 0 and 1, so a 1 standard deviation change is hard to interpret.
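For reference, the calculation performed by lm.beta() can be written as follows (the notation is ours). The second expression gives the approximate standard deviation of a 0/1 variable with sample proportion \hat p. It is at most about 0.5, so a 1 standard deviation shift in HOMELESS does not correspond to any change the variable can actually make.

```latex
\[
\hat\beta_j^{\ast} = \hat\beta_j \,\frac{s_{x_j}}{s_y},
\qquad
s_x \approx \sqrt{\hat p\,(1-\hat p)} \quad \text{for binary } x .
\]
```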
6 comments:
Regarding the interpretation problem at the end, Andrew Gelman makes a compelling argument for standardizing variables by 2 standard deviations so that the variance is similar to a binary variable (provided p is not too far from 0.5):
http://onlinelibrary.wiley.com/doi/10.1002/sim.3107/abstract
The arm package implements a standardize() function that appears to work similarly to lm.beta.
I think it would make more sense to standardize only the continuous ones; 2 SD makes sense for them. I would leave the categorical covariates as is, and also would not touch the outcome.
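The 2 standard deviation idea discussed in these comments can be sketched in base R. This is only an illustration of the rescaling, not the arm package's implementation. The helper name rescale2sd is hypothetical, and mtcars (with its 0/1 variable am) stands in for the HELP data.

```r
# Hypothetical helper: center a numeric vector and divide by 2 standard deviations.
rescale2sd <- function(x) (x - mean(x)) / (2 * sd(x))

d <- mtcars                          # stand-in data; am is a 0/1 variable
d$wt2 <- rescale2sd(d$wt)            # continuous predictor: rescaled
d$hp2 <- rescale2sd(d$hp)            # continuous predictor: rescaled

# The binary predictor (am) and the outcome (mpg) are left on their original scales.
fit2 <- lm(mpg ~ wt2 + hp2 + am, data = d)
coef(fit2)
```

Each rescaled slope is then the expected change in the outcome for a 2 standard deviation change in that predictor, which is roughly comparable to the 0-to-1 change of a balanced binary variable.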
Just curious, what is the rationale/support for the statement: "such an assessment ignores the confidence limits associated with each pairwise association"? Cheers!
May I ask how to get a 95% confidence interval for the standardized coefficients obtained from a linear regression?
You can run "Make.Z()" in the QuantPsyc package to convert your data (then lm() would do this for you automatically).
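One way to act on this suggestion is sketched below, using base R's scale() in place of Make.Z() and mtcars as a stand-in data set (both are our assumptions). The intervals from confint() treat the sample standard deviations as fixed, so they should be read as approximate.

```r
# Standardize the outcome and predictors, then refit the model.
vars  <- c("mpg", "wt", "hp")                  # stand-in variables from mtcars
z     <- as.data.frame(scale(mtcars[, vars]))  # each column: mean 0, SD 1
fit_z <- lm(mpg ~ wt + hp, data = z)

coef(fit_z)[-1]                      # standardized slopes
confint(fit_z, level = 0.95)[-1, ]   # 95% intervals, intercept row dropped
```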
I have a question that is both an R question and a statistical question:
I am analysing the sales of a retailer. These sales are related to some vars: var1, var2, var3, ..., varN
Most of the vars are continuous.
I want to analyze the relationship between sales and the vars. I have fit a linear regression with R:
rg<-lm(sales ~ var1 + var2 + var3 + var4, data=sales_2017)
summary(rg)
Now I want to know which is the most important variable in sales, and to know the percent of importance of each var. I am doing this (caret package):
varImp(rg, scale = FALSE)
rsimp <- varImp(rg, scale = FALSE)
plot(rsimp)
Is this a good method for obtaining variable importance? Is it a good way to do it in R?
Thanks in advance. Any advice will be greatly appreciated.
Juan