Monday, August 24, 2009

packages and CRANtastic

Additional functionality in R is added through packages, which bundle functions, datasets, examples and help files and can be downloaded from CRAN (the Comprehensive R Archive Network). Packages are downloaded and installed either with the function install.packages() or through the windowing interface under Packages and Data (Mac) or Packages (Windows) (see section B.6.1, p. 273).
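As a minimal sketch of the command-line route, the snippet below installs a package only if it is not already available. The helper name ensure_package() is hypothetical (it is not part of R), and the install step assumes network access and a configured CRAN mirror.

```r
# ensure_package() is a hypothetical helper, not a function provided by R.
ensure_package <- function(pkg) {
  # requireNamespace() returns FALSE if the package cannot be loaded.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # downloads from CRAN; needs network access
  }
  # Report whether the package is usable after the attempt.
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this call downloads nothing.
ok <- ensure_package("stats")
```

Once a package is installed, library() attaches it in each new session, for example library(plyr).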

As of August 2009, there were 1,907 packages on CRAN, up from 1,705 in March 2009 (the CRAN website has the current list). While each of these has met a minimal standard for inclusion, it is important to keep in mind that R packages are typically created by individuals or small groups, and are not endorsed by the R core group. As a result, they do not necessarily undergo the same level of testing and quality assurance as the core R system.

CRANtastic is a free, open-source web application that allows users to search for, review and tag CRAN packages. It was created by Hadley Wickham and is currently being developed by Bjørn Mæland. It can tell you more about a package than its mere inclusion on CRAN does.

As an example, consider the entry for the plyr package. This is a set of tools written by Hadley that solves a common set of problems: you need to break a big problem down into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to fit a model to each spatial location or time point in your study, summarise data by panels or collapse high-dimensional arrays to simpler summary statistics. The CRANtastic entry provides detailed release information, names the author and maintainer, mentions that (as of August 23, 2009) 7 people have noted that they use the package, and lists 4 overall ratings (5 stars) and 3 ratings for documentation (5 stars). A user named eamani provided a review. The entry also lists related packages, dependencies and reverse dependencies.
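To make the break-apart, operate, reassemble idea concrete, here is a rough sketch in base R rather than plyr itself, so it runs without installing anything. The data frame, its column names (site, y) and the choice of a per-site mean are all invented for illustration.

```r
# Toy data: a measurement y taken at three hypothetical sites.
df <- data.frame(
  site = c("a", "a", "b", "b", "c", "c"),
  y    = c(1, 3, 10, 20, 5, 7)
)

# Split: one data frame per site.
pieces <- split(df, df$site)

# Apply: summarise each piece independently.
summaries <- lapply(pieces, function(piece) {
  data.frame(site = piece$site[1], mean_y = mean(piece$y))
})

# Combine: stack the per-site summaries back into one data frame.
result <- do.call(rbind, summaries)
print(result)
```

With plyr installed, the same three steps should collapse into a single call along the lines of ddply(df, .(site), summarise, mean_y = mean(y)), which is the convenience the package is offering.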

While still new, with relatively few users, this website has great potential to offer guidance about packages. If it takes off as an active community, it could provide a map to the particularly useful routines available within R.
