These pages provide hints for data analysis using R, emphasizing methods covered in the graduate course, Biol 501: Quantitative methods in ecology and evolution.
Download R from the CRAN website.
Mac OS X users: Some functions require that you also install the latest version of the XQuartz package.
Many students run R within the RStudio environment, which you can download separately.
Use a text file to write and edit your R commands. This keeps a record of your analyses for later use, and makes it easier to rerun and modify analyses as data collection continues.
R has a built-in editor that makes it easy to submit commands selected in a script file to the command line. Go to “File” on the menu and select “New Document” (Mac) or “New script” (PC). Save to a file with the .R extension. To open a preexisting file, choose “Open Document” or “Open script” from the “File” menu. Execute a line of code by placing the cursor on the line and pressing the keys <command><return> (Mac) or <control>R (PC).
R will start up if you double click a “.R” file. If this happens, R might not load the workspace. Enter load(".RData") in the command window and all will be well.
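As a rough illustration of what load(".RData") does, the sketch below saves one object to a workspace file and restores it. It assumes a temporary folder and a made-up object named x, so that nothing in your own project is touched.

```r
# Work in a temporary folder so no real workspace file is overwritten
wsfile <- file.path(tempdir(), ".RData")

x <- c(1.2, 3.4, 5.6)      # an object we want to keep (made-up values)
save(x, file = wsfile)     # save.image() would save the whole workspace instead
rm(x)                      # simulate starting a fresh session without x

load(wsfile)               # restores x from the file
x
```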
Here are some very basic tips for writing scripts.
- Use a new script file for each project.
- Write lots of notes in the script file to record how and why you did that particular analysis. This is essential when reviewing it weeks (years) later. Annotate as though someone else will be reading your script later and attempt to duplicate your effort and make sense of it.
- When possible, write generic code that can easily be extended to other situations with a minimum of editing. For example, write code to read values of y from a data file rather than code the points in an R script file.
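The last tip might look something like the following sketch. The file name mydata.csv and the values in it are invented for illustration; the first two lines only create a small file so the example runs on its own.

```r
# Create a small data file so this sketch is self-contained.
# In a real project the file would already exist (the name is hypothetical).
datafile <- file.path(tempdir(), "mydata.csv")
write.csv(data.frame(y = c(3.1, 4.7, 5.2, 6.0)), datafile, row.names = FALSE)

# Read y from the file rather than typing the values into the script
mydata <- read.csv(datafile)
y <- mydata$y

# These lines need no editing when more rows are added to the file
mean(y)
```

Because the values live in the data file, the script can be rerun unchanged as data collection continues.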
R has a core set of command libraries (base, graphics, stats, etc), but there is a wealth of add-on packages available (the full list is available at the CRAN web site).
Packages already included
The following are a few of the add-on packages already included with your standard R installation.
boot – bootstrap resampling
foreign – read data from files in the format of other stats programs
lattice – multi-panel graphics
MASS – software and data associated with the book by Venables and Ripley, Modern Applied Statistics with S-PLUS
mgcv – generalized additive models
nlme – linear and nonlinear mixed-effects models; generalized least squares
To use one of them you need to load it with the library() command, for example library(MASS).
You’ll have to do this again every time you run R.
To see all the libraries available on your computer, enter library() with no arguments.
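A minimal sketch of checking and loading packages from a script follows. It uses installed.packages() rather than the interactive library() listing so the result can be stored in a variable. The three package names are taken from the list above, and the check before loading is a precaution in case an installation lacks them.

```r
# Names of every package installed on this computer
installed <- rownames(installed.packages())

# Check whether a few of the packages listed above are present
wanted <- c("boot", "MASS", "lattice")
wanted %in% installed

# Load one of them for this session (only if it is actually installed)
if ("boot" %in% installed) {
  library(boot)
}
```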
Example packages available for download
Most R packages are not included with the standard installation, and you need to download and install them before you can use them. Here are a few add-on packages that might be useful in ecology and evolution. The full list of available packages is on the CRAN web site.
ape – phylogenetic comparative methods
car – linear model tools (e.g., alternative sums of squares)
leaps – all subsets regression
emmeans – group means for ANOVA and other linear models
meta – meta-analysis
multcomp – multiple comparisons for linear models
popbio – analyzing matrix population models
pwr – power analysis
qtl – QTL analysis
shapes – geometric morphometrics
vegan – ordination methods for community ecology
visreg – visualize linear model fits
To install one of these packages use the menu bar in R. Select “Install packages” under the “Packages” menu item. You’ll have to select a download site (Canada BC). Then select your package from the list provided.
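Or, instead of using the menu, run install.packages() with the package name in quotes, for example install.packages("visreg").
To use a package once it is installed, load it by entering library() with the package name, for example library(visreg).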
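The two steps can be combined in a script so that a package is downloaded only when it is missing. The helper name load_or_install and the mirror URL are assumptions made for this sketch, not part of R itself. The demonstration call uses splines, which ships with R, so nothing is downloaded.

```r
# Hypothetical helper: install a package only if it is missing, then load it.
# The repos URL is an assumed CRAN mirror; substitute the one you prefer.
load_or_install <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos, dependencies = TRUE)
  }
  library(pkg, character.only = TRUE)
}

# "splines" comes with R, so this call only loads it.
# For an add-on package you would write, e.g., load_or_install("visreg").
load_or_install("splines")
```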
R is under constant revision, so it is a good idea to install the latest version periodically. Once you have done so, you should also download and install the latest versions of all your add-on packages.
Use “?” in the R command window to get documentation for a specific command. For example, to get help on the “mean” function to calculate a sample mean, enter ?mean (or, equivalently, help(mean)).
You can also search the help documentation on a more general topic using “??” or “help.search”. For example, use the following commands to find out what’s available on anova and linear models.
??anova
??"linear models" # same as help.search("linear models")
A window will pop up that lists commands available and the packages that include them. To use one of the listed commands you might have to load the corresponding library. (See “Add-on packages” for help on how to load libraries.) Note that the “??” command will only search documentation in the R packages installed on your computer.
Interpreting a help page
As an example, here’s how to interpret the help page for the sample mean, obtained by entering ?mean in the command window.
In the pop-up help window, look under the title “Usage” and you will see something like this:
mean(x, trim = 0, na.rm = FALSE, ...)
The items between the brackets “()” are called arguments.
Any argument without an “=” sign is required — you must provide it for the command to work. Any argument with an “=” sign represents an option, with the default value indicated. (Ignore the “…” for now.)
In this example, the argument “x” represents the data object you supply to the function. Look under “Arguments” on the help page to see what kind of object R needs. In the case of the mean almost any data object will do, but you will usually apply the function to a vector (representing a single variable).
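The same information can be inspected from the command line. The sketch below assumes you look at mean.default, the method that carries the arguments shown under “Usage” (mean itself is a generic function whose own argument list is just x and ...).

```r
# Argument list of the default method for mean()
arglist <- formals(mean.default)

names(arglist)     # argument names, as listed under "Usage"
arglist$trim       # default value of the optional argument trim
arglist$na.rm      # default value of the optional argument na.rm
```

Arguments that print a value here are the optional ones; “x” has no default and must be supplied.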
If you are happy with the default settings, then you can use the command in its simplest form. If you want the mean of the elements in the variable “myvariable”, enter mean(myvariable).
If the default values for the options don’t meet your needs you can alter the values. The following example changes the “na.rm” option to TRUE: mean(myvariable, na.rm = TRUE). This instructs R to remove missing values from the data object before calculating the mean. (If you fail to do this and have missing values, R will return “NA”.)
The following example changes the “trim” option to calculate a trimmed mean: mean(myvariable, trim = 0.1).
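Here is a small worked example of the three calls. The values in myvariable are invented for illustration and include one outlier and one missing value.

```r
# Made-up data: one large outlier (100) and one missing value (NA)
myvariable <- c(2, 4, 6, 8, 100, NA)

m1 <- mean(myvariable)                            # NA, because of the missing value
m2 <- mean(myvariable, na.rm = TRUE)              # 24, the mean of the five observed values
m3 <- mean(myvariable, trim = 0.2, na.rm = TRUE)  # 6, after dropping the smallest and largest value

m1
m2
m3
```

With trim = 0.2 and five non-missing values, R drops one observation from each end before averaging, which is why the outlier no longer affects the result.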
R commands to analyze the data for all examples presented in the 2nd edition of The Analysis of Biological Data by Michael Whitlock and Dolph Schluter are here.
Several excellent R books are available free to UBC students online through the UBC library. Check the “Books” tab on the main course page.
Tom Short’s R reference card
An R blog! Daily news and tutorials about R.
Someone has solved your problem already
If you want to accomplish something in R and can’t quite figure out how, and your books aren’t helping, chances are that someone has already solved the problem and the answer is sitting on a web page somewhere on the internet. Google or the R project Search Engine might find it for you.