An anonymous commenter expressed a desire to see how one might use SAS to draw a bubble plot with bubbles in three colors, corresponding to a fourth variable in the data set (x, y, z for bubble size, and the category variable). In previous entries we discussed bubble plots and showed how to make the bubbles print in two colors depending on a fourth *dichotomous* variable.

The SAS approach used there cannot be extended to a fourth variable with many values, so here we show an approach that can. The R version below is a trivial extension of the code demonstrated earlier.

**SAS**

We'll start by making some data-- 20 observations in each of 3 categories.

data testbubbles;

do cat = 1 to 3;

do i = 1 to 20;

abscissa = normal(0);

ordinate = normal(0);

z = uniform(0);

output;

end;

end;

run;

Our approach will be to make an `annotate` data set using the `annotate macros` (section 5.2). The `%slice` macro easily draws filled circles. Full details on the parameters it needs are in the on-line help: SAS Products; SAS/GRAPH; The Annotate Facility; Annotate Dictionary. Here we note that the 5th parameter is the radius of the circle, chosen here as an arbitrary function of z that makes pleasingly sized circles. Other parameters reflect color density, arc, and starting angle, which could be used to represent additional variables.

%annomac;

data annobub1;

set testbubbles;

%system(2,2,3);

%slice(abscissa, ordinate, 0, 360, sqrt(3*z), green, ps, 0);

run;

Unfortunately, due to a quirk of the macro facility, I don't think the color can be changed conditionally in the preceding step. Instead, we need a new data step to do this.

data annobub2;

set annobub1;

if cat=2 then color="red";

if cat=3 then color="blue";

run;

Now we're ready to plot. We use the `symbol` (section 5.2.2) statement to tell `proc gplot` not to plot the data, add the annotate data set, and suppress the legend, as the default legend will not look correct here. An appropriate legend could be generated with a `legend` statement.

symbol1 i=none r=3;

proc gplot data=testbubbles;

plot ordinate * abscissa = cat / annotate = annobub2 nolegend;

run;

quit;

The resulting plot is shown above. Improved axes are demonstrated throughout the book and in many previous blog posts.

**R**

The R approach merely requires passing three colors to the `bg` option in the `symbols()` function. To mimic SAS, we'll start by defining some data, then generate the vector of colors needed.

cat = rep(c(1, 2, 3), each=20)

abscissa = rnorm(60)

ordinate = rnorm(60)

z = runif(60)

plotcolor = ifelse(cat==1, "green", ifelse(cat==2, "red", "blue"))

The nested calls to the `ifelse` function (section 1.11.2) allow vectorized conditional tests with more than two possibilities. Another option would be to use a `for` loop (section 1.11.1) but this would be avoiding one of the strengths of R. In this example, I suppose I could have defined the `cat` vector with the color values as well, and saved some keystrokes.
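As a minimal sketch of the alternatives mentioned above, the following compares the nested `ifelse()` call with an explicit `for` loop and with a lookup by name. The names `color_ifelse`, `color_loop`, `color_lookup`, and `palette3` are our own illustrative choices and do not appear in the original code.

```r
# Category labels as in the entry: 20 observations in each of 3 groups
cat <- rep(c(1, 2, 3), each = 20)

# Vectorized version from the text
color_ifelse <- ifelse(cat == 1, "green", ifelse(cat == 2, "red", "blue"))

# Equivalent for loop: it works, but is more verbose
color_loop <- character(length(cat))
for (i in seq_along(cat)) {
  if (cat[i] == 1) {
    color_loop[i] <- "green"
  } else if (cat[i] == 2) {
    color_loop[i] <- "red"
  } else {
    color_loop[i] <- "blue"
  }
}

# Lookup by name (palette3 is a hypothetical helper vector)
palette3 <- c("1" = "green", "2" = "red", "3" = "blue")
color_lookup <- unname(palette3[as.character(cat)])

# All three approaches should agree
stopifnot(identical(color_ifelse, color_loop),
          identical(color_ifelse, color_lookup))
```

The lookup version scales most gracefully if more categories are added later, since only `palette3` needs to change.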

With the data generated and the color vector prepared, we need only call the `symbols()` function.

symbols(abscissa, ordinate, circles=z, inches=1/5, bg=plotcolor)

The resulting plot is shown below.
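The SAS code used a square-root function of z for the radius and mentioned that a legend could be added. A rough R sketch of both ideas follows. Since `circles` supplies radii, taking the square root of z makes the bubble area, rather than the radius, proportional to z. The seed, the names `colors3` and `radius`, the axis labels, and the legend placement are all our own assumptions for illustration.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# Same data layout as in the entry
cat <- rep(c(1, 2, 3), each = 20)
abscissa <- rnorm(60)
ordinate <- rnorm(60)
z <- runif(60)

colors3 <- c("green", "red", "blue")  # one fill color per category
plotcolor <- colors3[cat]

# circles= gives radii, so a square-root transform makes area track z
radius <- sqrt(z)

symbols(abscissa, ordinate, circles = radius, inches = 1/5, bg = plotcolor,
        xlab = "abscissa", ylab = "ordinate")

# Legend mapping fill colors to categories
legend("topright", legend = paste("cat", 1:3), pch = 21, pt.bg = colors3,
       title = "Category")
```

Whether radius or area should carry the value is a design choice; the area-based version tends to exaggerate large values less.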

I would rather use indexing to assign the color vector:

plotcolor = c("green","red","blue")[cat]

Nice!

Thx for this. I was trying to find ways to plot bubble plots in R and it was hard to find. Now I know :)

Happy to be here, Anonymous. Check out the linked earlier entries or the documentation for symbols() to see a whole bunch of similar cool things to do.

Nice to see you guys are active again!

The sgplot procedure can also do this easily:

data test;

do i = 1 to 40;

cat = ceil(i/10);

x = normal(0) - cat;

y = x + normal(0);

size = normal(0);

output;

end;

run;

proc sgplot data = test;

bubble x=x y=y size=size / group=cat;

run;

quit;

Thanks for posting the sgplot proc, that is much easier! Do you know how to specify the different colors? There is the FILLATTRS color option, but I can't seem to figure out how to tell SAS to use more than one color.

Let's say in your example you want category1=red, category2=yellow, etc.

data attrmap;

retain id "myid";

length fillcolor $ 10;

input value $ fillcolor $;

datalines;

1 green

2 gray

3 blue

4 red

; run;

proc sgplot data = test dattrmap=attrmap;

bubble x=x y=y size=size / group=cat attrid=myid;

run;

quit;

I've been looking for an answer to the same question for quite a few hours. Thanks a lot, it really helped!

