Showing posts with label SAS. Show all posts

Monday, July 26, 2010

Using SAS for Data Management, Statistical Analysis, and Graphics

Our newest book, Using SAS for Data Management, Statistical Analysis, and Graphics, will soon be shipping from Amazon, CRC Press, and other fine retailers.



The book complements our SAS and R book, particularly for users less interested in R. It presents an easy way to learn how to perform analytical tasks in SAS, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation, and demonstrates useful applications, shortcuts, and tricks. Organized by short, clear descriptive entries, the book covers many common tasks, such as data management, descriptive summaries, inferential procedures, regression analysis, multivariate methods, and the creation of graphics.

Through the extensive indexing, cross-referencing, and worked examples in this text, users can directly find and implement the material they need. The text includes convenient indices organized by topic and SAS syntax, and presents example analyses that employ a single data set from the HELP study to demonstrate the SAS code in action and facilitate exploration. We also provide several case studies of more complex applications. Data sets and code are available for download on the book’s website. Many features of SAS version 9.2 (including new procedures and ODS support) are highlighted.

The book tries to lucidly summarize the aspects of SAS most often used by statistical analysts. We believe that new users of SAS will find the simple approach easy to understand, while more sophisticated users will appreciate the invaluable source of task-oriented information.

Note as of August 6, 2010: the book is now shipping from Amazon, albeit with no discount.

Thursday, July 2, 2009

Example 7.4: A prettier jittered scatterplot

The plot in section 7.3 has some problems. At the very least, the jittered values ought to be between 0 and 1, so the smoothed lines fit better with them. Once again we use the data generated in section 7.2 as an example. For both SAS and R, we use conditioning (section 1.11.2) to make the jitter happen within the 0-1 range.

SAS
In SAS, we use axis statements (section 5.3.8) to clean up the axis tick marks and labels.

/* jitter each observation toward the interior of the 0-1 range */
data lp1;
set test;
jitter = uniform(0) * 0.075;
if ytest eq 1 then yplot = ytest - jitter;
else if ytest eq 0 then yplot = ytest + jitter;
run;

/* axis labels without minor tick marks */
axis1 minor = none label = ("xtest");
axis2 minor = none label = (angle=270 rotate=90 "ytest");
/* symbol1: smoothed line through ytest; symbol2: small dots for yplot */
symbol1 i=sm50s v=none c = blue;
symbol2 i=none v=dot h = .2 c = blue;
proc gplot data = lp1;
plot (ytest yplot) * xtest / overlay haxis=axis1 vaxis=axis2;
run;
quit;


And the resulting plot is:

[Figure: SAS plot of the jittered ytest values against xtest, with the smoothed line overlaid]

R


In R, we add a label to the y axis with the ylab option (section 5.3.8). We also modify the smoother to be a little less responsive to the data (by using a wider window, see section 5.2.6).


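# jitter symmetrically, then shift so the points stay inside the 0-1 range
jittery <- jitter(ytest, amount=.0375)
correction <- ifelse(ytest==0, .0375, -.0375)
jittery <- jittery + correction
# set up empty axes with the y axis label, then add points and smoother
plot(xtest, ytest, type="n", ylab = "ytest")
points(xtest, jittery, pch = 20, col = "blue")
lines(lowess(xtest, ytest, f = .4), col = "blue")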
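
A quick way to check that the conditioning keeps every jittered point inside the 0-1 range is to inspect the range of the jittered values within each outcome group. The sketch below is self-contained, so it simulates its own stand-in data. The sample size, seed, and logistic coefficients are assumptions made for illustration and are not the values from section 7.2.

```r
# Hypothetical stand-in for the section 7.2 data: a binary outcome whose
# probability depends on xtest (n, the seed, and the coefficients are assumptions)
set.seed(42)
n <- 1000
xtest <- rnorm(n, mean = 1, sd = 1)
ytest <- rbinom(n, size = 1, prob = plogis(-1 + 0.5 * xtest))

# Same conditioning as in the example: symmetric jitter, then a shift
# upward for the 0s and downward for the 1s
jittery <- jitter(ytest, amount = .0375)
correction <- ifelse(ytest == 0, .0375, -.0375)
jittery <- jittery + correction

# The 0s should now fall between 0 and .075, the 1s between .925 and 1
range(jittery[ytest == 0])
range(jittery[ytest == 1])
all(jittery >= 0 & jittery <= 1)
```

Because jitter() with amount=.0375 adds uniform noise between -.0375 and .0375, the correction moves that noise entirely to one side of the observed value, which should match the 0.075-wide band used in the SAS code.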



And the resulting plot is:

[Figure: R plot of the jittered ytest values against xtest, with the lowess line overlaid]

As with the uglier version shown in example 7.3, the differences between the two plots result from the different randomly generated data sets and from the two different smoothers (the sm50s spline interpolation in SAS and lowess in R).
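
To get a feel for how much the choice of smoother settings alone can change the curve, one can overlay two lowess fits with different spans on the same data. The sketch below again simulates hypothetical data (the seed and coefficients are assumptions), and the comparison span f = .2 is an arbitrary value chosen for illustration, not one taken from the examples.

```r
# Hypothetical simulated data; all values here are assumptions
set.seed(7)
n <- 1000
xtest <- rnorm(n, mean = 1, sd = 1)
ytest <- rbinom(n, size = 1, prob = plogis(-1 + 0.5 * xtest))

# Two lowess fits: f is the proportion of points used in each local fit,
# so a larger f gives a smoother, less responsive curve
narrow <- lowess(xtest, ytest, f = .2)
wide <- lowess(xtest, ytest, f = .4)

# Overlay both smoothers on empty axes
plot(xtest, ytest, type = "n", ylab = "ytest")
lines(narrow, col = "red", lty = 2)
lines(wide, col = "blue")
legend("topleft", legend = c("f = .2", "f = .4"),
       col = c("red", "blue"), lty = c(2, 1))
```

Any visible gap between the two curves comes only from the span, since both fits use the same data.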

In the next example, we'll show how to make a SAS Macro or an R function to replicate this plot easily.