Monday, February 28, 2011

Plug for RStudio: powerful, free, and easy to use interactive development environment for R




As a longtime SAS user, one obstacle for me in using R professionally has been figuring out a process for saving and testing code across several work sessions and for integrating code composition and execution. There are a couple of integrated R environments available, including ESS, Tinn-R, and others. However, each of these seemed to require a serious investment of time, and I never did get around to using them (nor did Nick, despite several good-faith attempts). Instead I used a clunky system of editing code in a text editor, then copying and pasting or sourcing it. This really inhibited my ability first to learn R and then to code in it efficiently.

Then Nick introduced me to the folks who have created RStudio. They are a small group of wicked smart programmers who know how to help other programmers be more efficient. They've now turned their attention to helping statisticians and other R users. RStudio, publicly available as of 2/28/2011, is free and open source. Its abilities are extremely broad, and I'm bound to miss something important in the brief description below, but suffice it to say that it's well worth your time to check it out. Neither Nick nor I have any vested interest in recommending it (though he's moved all of his teaching of introductory and intermediate statistics courses to it, along with his collaborative research projects).

RStudio is an integrated development environment for R that includes 1) text editing windows from which code can be submitted to the console and/or saved to the OS, 2) live lists of the objects in your workspace, 3) an easily searchable, unlimited history with the ability to insert from the history into the console or a text editing window, 4) tab completion in the console for objects, commands, and help, 5) an interface with the OS for access to files, 6) a help window with back and forward buttons, 7) package downloading, and 8) support for Sweave to facilitate reproducible analysis. Despite all these capabilities, RStudio is very easy to get started with.

There is also a server version, which you can access over the web if someone installs it and gives you access. If you're not familiar with this idea, it means you can work from most browsers--I was even able to use it on a Kindle. The server version saves your workspace from session to session, so you can work in exactly the same way, in exactly the same workspace (with a continuous history and all your objects), on whatever OS/CPU you have in front of you--Windows, Mac OS, Chrome OS, Linux. You can switch OS, you can shut your computer down, and RStudio comes up just as you left it. Forgot your laptop? No problem.

The standalone version is an ordinary downloadable program. It uses the existing R binaries on your Mac (OS X 10.5+), Windows (XP/Vista/7), or Ubuntu or Fedora Linux machine. The local and server applications have the same interface.

For me, the most useful aspect has been the integrated editor, but each one of the items I listed above has saved me a great deal of time over the past few months. The integrated help alone might be reason enough to adopt it. For me as a consulting statistician, RStudio is a huge leap forward. It changes R from an important tool that I have to be able to use into a plausible system in which to do all of my work. I really can't overstate its value to me. Go to http://www.rstudio.org/ to learn more, see screenshots, and download!

5 comments:

Anonymous said...

Thanks for the pointer.

If there is a way to put the source code and the console side by side, I don’t see it. This is a HUGE drawback for those with wide screens or multiple monitors.

Ken Kleinman said...

I also haven't found a way to do that, but I haven't found it irritating. And I use two monitors, one of which is pretty wide (the other is my laptop). I find the layout works very well for me; if I was going to change something, I'd keep the help tab and the plots tab in different windows.

Send a note to RStudio's developers-- they're nice. Odds are pretty high they're considering making it fully configurable or they have some sound reasoning about why they're not planning to do it.

Ken Kleinman said...

The new version of RStudio includes a fully customizable layout. See: http://blog.rstudio.org/2011/04/11/rstudio-beta2/

Matt Jans said...

Great post (and looks like a great book, too!...just grabbed a copy from the CRC Press site). I love RStudio for all the same reasons. I found your post googling for info on whether I could run SAS code from RStudio. I'm tired of moving between SAS, Stata, and R IDEs, and I'm wondering if there's a way to do all from one. Either one of the program's IDEs (RStudio would be my pref), or through a separate IDE. I'm not really a programmer (researcher who programs), so I don't know all the IDEs out there. I'm trying to set up a makeshift IDE with TextPad or Notepad++ (already have a workflow for my files that works well, so don't need versioning, etc.).

Any pointers would be very welcome. Thanks!

Ken Kleinman said...

Hi, Matt--

Thanks for that praise, and I hope that the book is helpful to you. Looking back on this post, it's hard to believe it's only 3 and a half years since RStudio appeared, given how much it's become a part of my process and how much more it now does.

I'm not aware of simple ways to run SAS from RStudio. It would be possible, though clunky, to use RStudio as a text editor for .sas, .log and .lst files, and to write R functions to submit SAS batch jobs in the OS. I think that would be a bit sad in light of the increased use of graphics in recent versions of SAS, though.
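As a rough sketch of the batch-job idea, something like the following might work. It is untested against any particular SAS installation. The function names `sas_batch_args` and `run_sas` are made up for illustration, and it assumes a SAS executable that can be called as `sas` and that accepts the `-sysin`, `-log`, and `-print` batch options.

```r
# Hypothetical helper: build the command-line arguments for a SAS batch run.
# The log and listing files are named after the .sas file.
sas_batch_args <- function(sas_file) {
  base <- tools::file_path_sans_ext(sas_file)
  c("-sysin", sas_file,
    "-log",   paste0(base, ".log"),
    "-print", paste0(base, ".lst"))
}

# Hypothetical wrapper: submit the job through the OS.
# Assumes `sas_exe` names a SAS executable on the search path.
run_sas <- function(sas_file, sas_exe = "sas", dry_run = FALSE) {
  args <- sas_batch_args(sas_file)
  if (dry_run) {
    # Return the command as a string without running anything.
    return(paste(c(sas_exe, args), collapse = " "))
  }
  status <- system2(sas_exe, args = shQuote(args))
  if (status != 0) warning("SAS exited with status ", status)
  invisible(status)
}
```

Calling `run_sas("analysis.sas", dry_run = TRUE)` only builds the command string, which is a cheap way to check the arguments before submitting a real job. After a real run, the .log and .lst files could be opened in the RStudio editor.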

With (a lot) more work, you could probably do a similar thing, but leveraging the RStudio interface and making SAS HTML output go to a file that RStudio could/would display in a quadrant.

Another option would be to work with the proc_r SAS macro described here http://sas-and-r.blogspot.com/2012/01/sas-macro-simplifies-sas-and-r.html. That would mean using the SAS IDE and basically using R only in batch, though. And many people have had trouble with proc_r. I have not used it in some time, myself.

If you find or make a nice solution, please tell us about it!