Monday, September 17, 2012

Example 10.2: Custom graphic layouts





In example 10.1 we introduced data from a CPAP machine. In brief, it's hard to tell exactly what's being recorded in the data set, but it seems to be related to the pattern of breathing. Measurements are taken five times a second, leading to on the order of 100,000 data points in a typical night. To get a visual sense of what a night's breathing looks like is therefore non-trivial.

Today, we'll make the graphic shown above, which presents an hour of data.

SAS
In SAS, the sgpanel procedure (section 5.1.11) will produce a similar graphic pretty easily. But we need to make a data set with indicators of the hour, and of ten-minute blocks within the hour. This we'll do with the ceil function (section 1.8.4).

data cycles2;
set cycles;
hour = ceil(time_min/60);
tenmin = ceil(time_min/10);
time_in_ten = mod(time_min - 1/300,10);
/* 1/300 adjustment keeps last measure in the correct
10-min block */
run;

title "Hour 4 of pressure";
proc sgpanel data = cycles2;
where hour eq 4;
panelby tenmin / layout=rowlattice rows=6 spacing = 4;
colaxis display=none;
rowaxis display = (nolabel);
series x = time_in_ten y = byte;
run; quit;

The resulting plot is shown below. It would be nicer to omit the labels on the right of each plot, but this does not appear to be an option. It would likely only be possible with a fair amount of effort.




R
In R, we'll use the layout() function to make a 7-row layout-- one for the title and 6 for the 10-minute blocks of time. Before we get there, though, we'll construct a function to fill the time block plots with input data. The function accepts a data vector and plots only 3,000 values from it, choosing the values based on an input hour and 10-minute block within the hour. To ensure an equal y-axis range for each call, we'll also send minimum and maximum values as input to the function. All of this will be fed into plot() with the type="l" option to make a line plot.

plot10 = function(hour, tenmins, miny, maxy, data=cycles){
start = hour*18000 + tenmins* 3000 +1
plot((1:3000)/300, cycles[(start + 1):(start +3000)],
ylim = c(miny,maxy),type="l", xaxs="i", yaxs="i")
}

The documentation for layout() is rather opaque, so we'll review it separately.

oldpar = par(no.readonly = TRUE)
# revert to this later

layout(matrix(1:7), widths=1, heights=c(3,8,8,8,8,8,8), respect=FALSE)

The layout() function divides the plot area into a matrix of cells, some of which will be filled by the next output plots. The first argument says where in the matrix the next N objects will go. All the integers 1...N must appear in the matrix; cells that will be left empty have a 0 instead. Here, we have no empty cells, and only one column, so the "matrix" is really just a vector with 1...7 in order. The widths option specifies the relative widths of the columns-- here we have only one column so any constant will result in the use of the whole width of the output area. Similarly, the heightsoption gives the relative height of the cells. Here the title will get 3/51 of the height, while each 10-minute block will get 8/51. This unequal shape of the plot regions is one reason to prefer layout() to some other ways to plot multiple images on a page. The respect option, when "TRUE" makes the otherwise relative widths and heights conform, so that a unit of height is equal to a unit of width. We also use layout() in example 8.41.

With the layout in hand, we're ready to fill it.

par(xaxt="n", mar = c(.3,2,.3,0) +.05)
# drop the x-axis, change the spacing around the plot
plot(x=1,y=1,type="n",ylim=c(-1,1), xlim=c(-1,1), yaxt="n",bty="n")
# the first (narrow) plot is just empty
hour=3
text(0,0,paste("Hour ", (hour + 1), " of pressure data"), cex=2)
# text to put in the first plot
miny = min(cycles[(hour * 18000 + 1):((hour + 1) * 18000)])
maxy = max(cycles[(hour * 18000 + 1):((hour + 1) * 18000)])
# find min and max across the whole hour, to keep range
# of y-axis constant across the plots
for (x in 0:5) plot10(hour, x, miny, maxy)
# plot the 6 ten-minute blocks
par(oldpar)
# reset the graphics options

The resulting plot is shown at the top of the entry. There's clearly something odd going on around 11-15 minutes into the hour-- this could be a misadjusted mask, or a real problem with the breathing. There's also a period around 58 minutes when it looks like breathing stops. That's what the machine is meant to stop.


An unrelated note about aggregatorsWe love aggregators! Aggregators collect blogs that have similar coverage for the convenience of readers, and for blog authors they offer a way to reach new audiences. SAS and R is aggregated by R-bloggers, PROC-X, and statsblogs with our permission, and by at least 2 other aggregating services which have never contacted us. If you read this on an aggregator that does not credit the blogs it incorporates, please come visit us at SAS and R. We answer comments there and offer direct subscriptions if you like our content. In addition, no one is allowed to profit by this work under our license; if you see advertisements on this page, the aggregator is violating the terms by which we publish our work.

2 comments:

nick said...

These are helpful visualizations. They also make me wonder if the information in them could be condensed more into a heat-map style graph. Sort of in the style of "lasagna plots": http://journals.lww.com/epidem/Fulltext/2010/09000/Lasagna_Plots__A_Saucy_Alternative_to_Spaghetti.15.aspx

How would the oscillations look in a heat map? Could be interesting, although it might make you dizzy.

Ken Kleinman said...

Hi, Nick-- I think my plot will look a lot more like lasagna, if I add more hours to it, than these heat maps. I hadn't seen the lasagna plot recently, if ever. There's a link to the data in example 10.1, if you want to play with it.

What I'd really like to do for analysis would be to subtract out the cyclical pattern and show/interpret mainly deviations from it, but I'm quite ignorant about time-series methods, which I assume would help with that.