Showing posts with label R packages. Show all posts

Thursday, May 26, 2011

another look at CRAN Task Views

We've been impressed with how helpful the CRAN Task Views are in guiding us as we wend our way through the huge number of add-on packages for R (3,021 as of May 2011). These are web pages maintained by volunteers with expertise in a specified area, who provide annotated guidance to routines and packages. They are particularly helpful for tracking new packages or functionality (along with the relatively low-volume R-packages mailing list and Crantastic).

As an example, we consider the Empirical Finance task view, which is maintained by Dirk Eddelbuettel. It includes descriptions of standard regression models, time series, finance, risk management, data and time management, as well as books with packages. Of particular help are the related links, including 6 other task views (Econometrics, Multivariate, Optimization, Robust, Social Sciences and Time Series).

Reviewing the Task View can help users to get up to speed in a given area, and we commend the R-core for this creative response to the growth of packages.

As of this week, the following Task Views were available:
Bayesian (Bayesian Inference),
ChemPhys (Chemometrics and Computational Physics),
ClinicalTrials (Clinical Trial Design, Monitoring, and Analysis),
Cluster (Cluster Analysis & Finite Mixture Models),
Distributions (Probability Distributions),
Econometrics (Computational Econometrics),
Environmetrics (Analysis of Ecological and Environmental Data),
ExperimentalDesign (Design of Experiments (DoE) & Analysis of Experimental Data),
Finance (Empirical Finance),
Genetics (Statistical Genetics),
Graphics (Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization),
gR (gRaphical Models in R),
HighPerformanceComputing (High-Performance and Parallel Computing with R),
MachineLearning (Machine Learning & Statistical Learning),
MedicalImaging (Medical Image Analysis),
Multivariate (Multivariate Statistics),
NaturalLanguageProcessing (Natural Language Processing),
OfficialStatistics (Official Statistics & Survey Methodology),
Optimization (Optimization and Mathematical Programming),
Pharmacokinetics (Analysis of Pharmacokinetic Data),
Phylogenetics (Phylogenetics, Especially Comparative Methods),
Psychometrics (Psychometric Models and Methods),
ReproducibleResearch (Reproducible Research),
Robust (Robust Statistical Methods),
SocialSciences (Statistics for the Social Sciences),
Spatial (Analysis of Spatial Data),
Survival (Survival Analysis),
TimeSeries (Time Series Analysis).
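
The short names in this list are also what the ctv package on CRAN expects when you browse or install a task view from within R. Here is a minimal sketch, assuming ctv is already installed. The vector name finance_related is ours, for illustration only, and the calls that contact CRAN are guarded so that nothing is downloaded when the code is run non-interactively.

```r
# Short names of the six task views linked from the Finance view
# (illustrative vector; the names are taken from the list above)
finance_related <- c("Econometrics", "Multivariate", "Optimization",
                     "Robust", "SocialSciences", "TimeSeries")

# Only contact CRAN in an interactive session where ctv is installed
if (interactive() && requireNamespace("ctv", quietly = TRUE)) {
  # Show the task views currently published on CRAN
  print(ctv::available.views())

  # Install only the packages marked as "core" in the Finance view;
  # drop coreOnly = TRUE to install every package the view mentions
  ctv::install.views("Finance", coreOnly = TRUE)
}
```

Installing a whole view can pull in a large number of packages, so starting with the core subset is probably the gentler choice.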

Are there other areas where a new task view would be useful? Feel free to comment with your thoughts and suggestions.

Monday, August 24, 2009

packages and CRANtastic

Additional functionality in R is added through packages, which consist of libraries of bundled functions, datasets, examples and help files that can be downloaded from CRAN (the Comprehensive R Archive Network). The function install.packages(), or the windowing interface under Packages and Data (Mac) or Packages (Windows), is used to download and install packages (see section B.6.1, p. 273).
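
As a rough sketch of the command-line route, the code below checks which packages are absent before calling install.packages(). The helper missing_packages() and the vector wanted are our own illustrative names, not part of R, and the download step is guarded so that it only runs in an interactive session.

```r
# Hypothetical helper: return the elements of `pkgs` that are not yet
# installed in any library on .libPaths()
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

wanted     <- c("plyr")               # packages we would like to have
to_install <- missing_packages(wanted)

# Download and install from CRAN only when something is actually missing
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}

# Once installed, a package is loaded in each session with library(), e.g.
# library(plyr)
```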

As of August 2009, there were 1,907 packages on CRAN, up from 1,705 in March 2009 (see here for the current list). While each of these has met a minimal standard for inclusion, it is important to keep in mind that packages within R are typically created by individuals or small groups, and are not endorsed by the R core group. As a result, they do not necessarily undergo the same level of testing and quality assurance that the core R system does.

CRANtastic is a free, open-source web application that allows users to search for, review and tag CRAN packages. It was created by Hadley Wickham and is currently being developed by Bjørn Mæland. It can help you learn more about a package than its inclusion on CRAN allows.

As an example, consider the entry for the plyr package. This is a set of tools written by Hadley that solves a common set of problems: you need to break a big problem down into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to fit a model to each spatial location or time point in your study, summarise data by panels or collapse high-dimensional arrays to simpler summary statistics. The CRANtastic entry provides detailed release information, lists the author and maintainer, mentions that (as of August 23, 2009) 7 people have noted that they use it, and reports 4 overall ratings (5 stars) and 3 ratings for documentation (5 stars). A user named eamani provided a review. A search for related packages, dependencies and reverse dependencies is also included.
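
To make the "break down, operate, put back together" idea concrete, here is a small sketch on made-up data (the data frame study and its values are invented for illustration). The first version spells out the three steps in base R; it is not how plyr is implemented, only the same pattern written by hand. The second version, run only if plyr is installed, should produce the same summary with a single call to ddply().

```r
# Toy data: three observations at each of three time points (invented values)
study <- data.frame(
  time  = rep(c("t1", "t2", "t3"), each = 3),
  value = c(1, 2, 3, 10, 20, 30, 100, 200, 300)
)

# Base R, step by step
pieces <- split(study, study$time)                 # break into pieces
means  <- lapply(pieces, function(d) {             # operate on each piece
  data.frame(time = d$time[1], mean_value = mean(d$value))
})
by_time <- do.call(rbind, means)                   # put the pieces back together
rownames(by_time) <- NULL

# The same pattern with plyr, if it is available
if (requireNamespace("plyr", quietly = TRUE)) {
  by_time_plyr <- plyr::ddply(study, "time", function(d) {
    data.frame(mean_value = mean(d$value))
  })
}

print(by_time)
```

The appeal of plyr is that the splitting and recombining are handled for you, so only the per-piece function needs to be written.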

While still new, with relatively few users, this website has great potential to help provide some guidance about packages. If it takes off as an active community, this could help provide a map to particularly useful routines to utilize within R.