These pages provide hints for data analysis using R, emphasizing methods covered in the graduate course, Biol 501: Quantitative methods in ecology and evolution.

Mac OS X users: Some functions require that you also install the latest version of the XQuartz package.

Many students run R within the RStudio environment, which you can download separately.

## Use a script file

Use a text file to write and edit your R commands. This keeps a record of your analyses for later use, and makes it easier to rerun and modify analyses as data collection continues.

R has a built-in editor that makes it easy to submit commands selected in a script file to the command line. Go to “File” on the menu and select “New Document” (Mac) or “New script” (PC). Save to a file with the .R extension. To open a preexisting file, choose “Open Document” or “Open script” from the “File” menu. Execute a command by placing the cursor on its line and pressing Command-Return (Mac) or Ctrl-R (PC).

Here are some basic tips for writing scripts.

• Use a new script file for each project.
• Write lots of notes in the script file to record how and why you did each analysis. This is essential when reviewing it weeks (or years) later. Annotate as though someone else will read your script later and try to reproduce and make sense of your work.
• Write generic code that can easily be extended to other situations with a minimum of editing. For example, write code to read values of x and y from a data file rather than typing the values directly into the script file.
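
As a minimal sketch of the last tip, the commands below read x and y from a data file instead of typing the values into the analysis code. The file name mydata.csv and its contents are made up for illustration, and the file is written to a temporary folder only so that the example runs on its own.

```r
# Create a small example data file so that this sketch is self-contained.
# In a real analysis the data file would already exist on disk.
datafile <- file.path(tempdir(), "mydata.csv")   # hypothetical file name
write.csv(data.frame(x = c(1, 2, 3, 4), y = c(2.1, 3.9, 6.2, 7.8)),
          file = datafile, row.names = FALSE)

# Read the values of x and y from the file rather than typing them in
mydata <- read.csv(datafile)

# The same commands now work for any file with columns named x and y
fit <- lm(y ~ x, data = mydata)
coef(fit)
```

When new data arrive, only the file changes; the analysis commands can be rerun unedited.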

R will start up if you double-click a script file. If this happens, R might not load the workspace. Enter load(".RData") in R’s command window to load it (the file must be in the current working directory).
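
To illustrate what loading a workspace file does, here is a small sketch that saves an object to a file, removes it from the session, and restores it with load(). The object name height and the file name example.RData are assumptions chosen for the example; R's own default workspace file is named .RData.

```r
# Hypothetical file in a temporary folder (R's default workspace file is ".RData")
wsfile <- file.path(tempdir(), "example.RData")

height <- c(150, 162, 171)     # an example object
save(height, file = wsfile)    # save.image(wsfile) would save every object instead
rm(height)                     # the object is now gone from the session

load(wsfile)                   # restore the saved object from the file
height
```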

## Add-on packages

R has a core set of command libraries (base, graphics, stats, etc.), but there is a wealth of add-on packages available (the full list is available at the CRAN web site). For example:

• boot – bootstrap resampling
• foreign – read data from files in the format of other stats programs
• ggplot2 – graphics
• lme4 – linear and generalized linear mixed-effects models
• MASS – package for the book by Venables and Ripley, Modern Applied Statistics with S-PLUS
• mgcv – generalized additive models

To use one of them, you need to load it:

library(packagename)

You’ll have to do this again every time you run R.

To see all the libraries available on your computer enter

library() 
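
If you would rather work with the list of packages as an R object, something like the following sketch may help. It uses installed.packages() and search() from base R; the object name installed is only an illustration.

```r
# Names of all packages installed on this computer
installed <- rownames(installed.packages())
head(installed)

# Is a particular package installed? ("stats" ships with every R installation)
"stats" %in% installed

# Packages currently loaded and attached in this session
search()
```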

### Example packages

Most R packages are not included with the standard installation, and you need to download and install them before you can use them. Here are a few add-on packages that might be useful in ecology and evolution. The full list of available packages is on the CRAN web site.

• car – linear model tools (e.g., alternative sums of squares)
• leaps – all subsets regression
• emmeans – group means for ANOVA and other linear models
• meta – meta-analysis
• pwr – power analysis
• qtl – QTL analysis
• shapes – geometric morphometrics
• vegan – ordination methods for community ecology
• visreg – visualize linear model fits

### Install an R package

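To install one of these packages, use the menu bar in R. Select “Install packages” under the “Packages” menu item. You’ll have to select a download site (e.g., Canada BC). Then select your package from the list provided.

Alternatively, install it from the command line, replacing packagename with the name of the package: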

install.packages("packagename", dependencies = TRUE)

To use a package once it is installed, load it by entering

library(packagename)
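
In a script, you may want to install a package only if it is missing and then load it. The helper below is one possible sketch; load_or_install is a hypothetical name, not a built-in R function. It is demonstrated with stats, which ships with R, so nothing is downloaded here.

```r
# Hypothetical helper: install a package only if it is not yet installed,
# then load it
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, dependencies = TRUE)   # needs an internet connection
  }
  library(pkg, character.only = TRUE)            # pkg is a character string
  invisible(TRUE)
}

# "stats" is always present, so this call only loads it.
# Replace with the name of an add-on package, e.g. "visreg".
load_or_install("stats")
```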

## Get help

### Built-in help

Use ? in the R command window to get documentation for a specific command. For example, to get help on the mean function to calculate a sample mean, enter

?mean

You can also search the help documentation on a more general topic using ?? or help.search. For example, use the following commands to find out what’s available on anova and linear models.

??anova
??"linear models"  # same as help.search("linear models")

A window will pop up that lists the commands available and the packages that include them. To use one of the commands listed, you might have to load the corresponding library. (See “Add-on packages” for help on how to load libraries.) Note that the ?? command searches only the documentation of R packages installed on your computer.
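
If you remember only part of a command's name, apropos() and find() from base R can complement ? and ??. A brief sketch follows; the object name matches is just an illustration.

```r
# Names of all objects in the attached packages that contain "mean"
matches <- apropos("mean")
matches

# Which attached package provides a given function?
find("mean")
```

Like ??, these look only at what is on your computer, and they are further limited to packages that are currently loaded.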

### Interpret a help page

As an example, here’s how to interpret the help page for the sample mean, obtained by

?mean

In the pop-up help window, look under the title “Usage” and you will see something like this:

mean(x, trim = 0, na.rm = FALSE, ...)

The items between the brackets “()” are called arguments.

Any argument without an “=” sign is required — you must provide it for the command to work. Any argument with an “=” sign represents an option, with the default value indicated. (Ignore the … for now.)

In this example, the argument x represents the data object you supply to the function. Look under “Arguments” on the help page to see what kind of object R needs. In the case of the mean almost any data object will do, but you will usually apply the function to a vector (representing a single variable).
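
The same information can be inspected from the command line. The sketch below uses args() and formals() on mean.default, the default method behind mean(), which is where the trim and na.rm options are defined.

```r
# Show the arguments and their default values
args(mean.default)

# Argument names only
names(formals(mean.default))

# Default values of the two optional arguments
formals(mean.default)$trim
formals(mean.default)$na.rm
```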

If you are happy with the default settings, then you can use the command in its simplest form. If you want the mean of the elements in the variable myvariable, enter

mean(myvariable)

If the default values for the options don’t meet your needs, you can alter the values. The following example changes the na.rm option to TRUE. This instructs R to remove missing values from the data object before calculating the mean. (If you fail to do this and have missing values, R will return NA.)

mean(myvariable, na.rm = TRUE)
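
Here is a small worked example, using made-up values for myvariable that include one missing value.

```r
# Made-up data with one missing value
myvariable <- c(2, 4, NA, 6)

mean(myvariable)                 # returns NA because of the missing value
mean(myvariable, na.rm = TRUE)   # returns 4, the mean of 2, 4 and 6
```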

The following example changes the trim option to calculate a trimmed mean, dropping 10% of the observations from each end of the sorted data before averaging:

mean(myvariable, trim = 0.1)
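
To see the effect of trimming, consider this sketch with ten made-up values, one of which is an extreme outlier.

```r
# Made-up data: the values 1 to 9 plus one extreme value
myvariable <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

mean(myvariable)               # 14.5, pulled upward by the outlier
mean(myvariable, trim = 0.1)   # 5.5, the mean of the middle eight values (2 to 9)
```

With ten observations, trim = 0.1 removes one value from each end of the sorted data.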

R commands to analyze the data for all examples presented in the 2nd edition of The Analysis of Biological Data by Whitlock and Schluter are here.

Several excellent R books are available free to UBC students online through the UBC library. Check the “Books” tab on the main course page.

• Tom Short’s R reference card
• Venables and Smith’s Introduction to R (pdf file)
• An R blog with daily news and tutorials about R