prediction - performance - plot

[visualizing classifier performance in R, with only 3 commands]


[Figure: demo plot]

Performance measures that ROCR knows:

Accuracy, error rate, true positive rate, false positive rate, true negative rate, false negative rate, sensitivity, specificity, recall, positive predictive value, negative predictive value, precision, fallout, miss, phi correlation coefficient, Matthews correlation coefficient, mutual information, chi square statistic, odds ratio, lift value, precision/recall F measure, ROC convex hull, area under the ROC curve, precision/recall break-even point, calibration error, mean cross-entropy, root mean squared error, SAR measure, expected cost, explicit cost.
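
Several of the names above are synonyms for the same confusion-matrix quantity. As a minimal sketch in base R (not ROCR's own implementation), the following computes a few of them by hand at a single cutoff. The toy `scores` and `labels` vectors and the choice of 0.5 as cutoff are hypothetical, and the sketch assumes that scores above the cutoff are called positive.

```r
# Hypothetical toy data: classifier scores and true class labels (1 = positive).
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
labels <- c(1,   1,   0,   1,   0,    1,   0,   0)

cutoff    <- 0.5                          # assumed decision threshold
predicted <- as.integer(scores > cutoff)  # 1 = predicted positive

# Confusion-matrix counts at this cutoff.
tp <- sum(predicted == 1 & labels == 1)
fp <- sum(predicted == 1 & labels == 0)
tn <- sum(predicted == 0 & labels == 0)
fn <- sum(predicted == 0 & labels == 1)

accuracy  <- (tp + tn) / length(labels)
tpr       <- tp / (tp + fn)   # true positive rate = sensitivity = recall
fpr       <- fp / (fp + tn)   # false positive rate = fallout
precision <- tp / (tp + fp)   # precision = positive predictive value
f_measure <- 2 * precision * tpr / (precision + tpr)  # precision/recall F measure
```

Sweeping the cutoff over all observed scores and recording a pair of such values at each step yields a parametric curve, for example a ROC curve for the pair (fpr, tpr).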

ROCR features:

ROC curves, precision/recall plots, lift charts, cost curves, custom curves by freely selecting one performance measure for the x axis and one for the y axis, handling of data from cross-validation or bootstrapping, curve averaging (vertically, horizontally, or by threshold), standard error bars, box plots, curves that are color-coded by cutoff, printing threshold values on the curve, tight integration with R's plotting facilities (making it easy to adjust plots or to combine multiple plots), fully customizable, easy to use (only 3 commands).
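
A few of these features can be sketched as follows. This assumes ROCR is installed and uses the `ROCR.simple` example data set that ships with the package. The measure abbreviations ("prec", "rec", "lift", "rpp") and the plot arguments `colorize` and `print.cutoffs.at` are taken from the package's help pages, so check `help(performance)` and `help(package=ROCR)` if your version differs.

```r
library(ROCR)
data(ROCR.simple)   # example scores and labels shipped with ROCR

pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)

# Custom curves: pick one measure for the y axis and one for the x axis.
pr_curve   <- performance(pred, measure = "prec", x.measure = "rec")
lift_chart <- performance(pred, measure = "lift", x.measure = "rpp")
roc        <- performance(pred, measure = "tpr",  x.measure = "fpr")

plot(pr_curve, colorize = TRUE)                     # curve color-coded by cutoff
plot(lift_chart)                                    # lift chart
plot(roc, print.cutoffs.at = seq(0, 1, by = 0.1))   # print threshold values on the curve
abline(a = 0, b = 1, lty = 2)                       # base graphics call: chance diagonal
```

The last line is an ordinary base-graphics call, which is one way to see the integration with R's plotting facilities: a ROCR plot can be annotated like any other R plot.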



Examples

  • Using ROCR's 3 commands to produce a simple ROC plot:
    pred <- prediction(predictions, labels)
    perf <- performance(pred, measure = "tpr", x.measure = "fpr")
    plot(perf, col=rainbow(10))
  • Gallery
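
The three commands above assume that `predictions` (classifier scores) and `labels` (true classes) already exist in the workspace. A self-contained variant, sketched here with the `ROCR.simple` example data set bundled with the package, might look like this. The final line additionally extracts the area under the ROC curve, using the "auc" measure and the `y.values` slot as described in the package's help pages.

```r
library(ROCR)
data(ROCR.simple)                          # example data shipped with ROCR

predictions <- ROCR.simple$predictions     # numeric classifier scores
labels      <- ROCR.simple$labels          # true class labels (0/1)

pred <- prediction(predictions, labels)                        # command 1
perf <- performance(pred, measure = "tpr", x.measure = "fpr")  # command 2
plot(perf)                                                     # command 3

# Scalar measures are stored in the y.values slot, one entry per run.
auc <- performance(pred, measure = "auc")@y.values[[1]]
print(auc)
```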

About

  • ROCR (with obvious pronunciation) is an R package for evaluating and visualizing classifier performance. It is...
  • ...easy to use: adds only three new commands to R.
  • ...flexible: integrates tightly with R's built-in graphics facilities.
  • ...powerful: currently, 28 performance measures are implemented, which can be freely combined to form parametric curves such as ROC curves, precision/recall curves, or lift curves. Many options are available, such as curve averaging (for cross-validation or bootstrap), augmenting the averaged curves with standard error bars or box plots, labeling cutoffs on the curve, or coloring curves according to cutoff.
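
Curve averaging over several runs can be sketched as follows, assuming ROCR is installed. The example relies on the `ROCR.xval` data set bundled with the package (predictions and labels from 10 cross-validation runs) and on the `avg` and `spread.estimate` plot arguments from the package's help pages. Drawing the error bars may require the gplots package.

```r
library(ROCR)
data(ROCR.xval)   # predictions and labels from 10 cross-validation runs

# Lists of vectors are treated as multiple runs.
pred <- prediction(ROCR.xval$predictions, ROCR.xval$labels)
perf <- performance(pred, measure = "tpr", x.measure = "fpr")

# One thin curve per run, then the vertically averaged curve on top,
# with standard error bars showing the spread across runs.
plot(perf, col = "grey", lty = 3)
plot(perf, avg = "vertical", spread.estimate = "stderror", lwd = 2, add = TRUE)

# Same averaging in a separate plot, with box plots instead of error bars.
plot(perf, avg = "vertical", spread.estimate = "boxplot")
```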

Citing

  • We have invested a lot of time and effort into this package. You can give us something back by citing the following paper:
    Tobias Sing, Oliver Sander, Niko Beerenwinkel, Thomas Lengauer.
    ROCR: visualizing classifier performance in R.
    Bioinformatics 21(20):3940-3941 (2005).  
    Paper at Bioinformatics

Download

Installation

  • Linux: Simply type "R CMD INSTALL ROCR_1.0-1.tar.gz".
  • Windows: From the pull-down menu, click on "Packages->Install Packages from local zip file", and then select the downloaded file ROCR_1.0-1.zip.
  • Note that you need to have R installed on your computer. It is freely available at http://www.r-project.org. If you have problems installing the package, the manual "R Installation and Administration" might be helpful. You will also need gplots from the R package bundle gregmisc (available from the Comprehensive R Archive Network).
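
As an alternative to the manual steps above, the package can usually be installed from within R. This sketch assumes that ROCR and gplots are available from your configured CRAN mirror; the instructions above only describe the downloadable archives, and the version installed this way may differ from 1.0-1.

```r
# Assumption: ROCR and gplots are available from the configured CRAN mirror.
install.packages("ROCR")     # dependencies are normally resolved automatically
install.packages("gplots")   # only needed if it was not pulled in as a dependency

library(ROCR)                # verify that the package loads
```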

Getting started

  • Loading ROCR: library(ROCR) (from within R).
  • Short demo: demo(ROCR) (after loading).
  • List of available help pages: help(package=ROCR).

Documentation

  • Reference Manual [PDF]
  • Slide deck for a tutorial talk (feel free to re-use for teaching, but please give appropriate credits and write us an email) [PPT]
  • A few pointers to the literature on classifier evaluation

Studies using and citing ROCR (please notify us of any others!)

  • CH Lemon, DV Smith (2006) The Journal of Neuroscience: Influence of response variability on the coding performance of central gustatory neurons.
  • S Hartley, R Harris, PJ Lester (2006) Ecology Letters: Quantifying uncertainty in the potential distribution of an invasive species: climate and the Argentine ant.
  • SJ Li, BS Liu, R Zeng, et al (2006) Computational Biology and Chemistry: Predicting O-glycosylation sites in mammalian proteins by using SVMs.
  • K Roomp, N Beerenwinkel, T Sing, et al (2006) Springer Lecture Notes in Computer Science 4075: Arevir: A secure platform for designing personalized antiretroviral therapies against HIV.
  • I Antes, SWI Siu, T Lengauer (2006) Bioinformatics: DynaPred: A structure and sequence based method for the prediction of MHC class I binding peptide
  • X Guo, R Liu, CD Shriver, et al (2006) Bioinformatics: Assessing semantic similarity measures for the characterization of human regulatory pathways.
  • N Beerenwinkel, T Sing, T Lengauer, et al (2005) Bioinformatics: Computational methods for the design of effective therapies against drug resistant HIV strains.

Software by other groups which has components for classifier evaluation

  • BioConductor has a ROC package.
  • Weka has an evaluation package, with a couple of performance measures.

Contact

  • Just send an email to Tobias Sing or Oliver Sander. Questions, comments, and suggestions are very welcome. We are also interested in seeing how ROCR is used in publications, so if you have prepared a paper using ROCR, we'd be happy to hear about it.