Approximate Bayesian Computations

Posted on February 14, 2014 by Sariel Hubner

In many cases it may be more straightforward (and informative) to test specific models against our data. An interesting approach for inferring population parameters and/or testing models is approximate Bayesian computation (ABC). Several tools are available, such as msBayes, DIYABC, PopABC, the abctools R package, and ABCTools.

Although ABC is a powerful and useful approach, it has some caveats, e.g. the choice of summary statistics, the number and complexity of the models tested, and the amount of data. With realistic expectations and simple models, ABC can add interesting insights to popgen studies.
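To make the idea concrete, here is a minimal sketch of the simplest ABC variant, rejection sampling, in plain Python. It is not how msBayes, DIYABC, or the other tools above work internally. The toy model (normally distributed data with an unknown mean), the flat prior, the tolerance, and all function names are assumptions chosen only for illustration.

```python
import random
import statistics


def simulate_mean(theta, n, rng):
    """Simulate n draws from Normal(theta, 1) and return their mean (the summary statistic)."""
    return statistics.fmean(rng.gauss(theta, 1.0) for _ in range(n))


def abc_rejection(observed_stat, n_obs, n_sims, tolerance, rng):
    """Keep prior draws whose simulated summary statistic lands close to the observed one."""
    accepted = []
    for _ in range(n_sims):
        theta = rng.uniform(-5.0, 5.0)  # flat prior on the parameter (assumed)
        sim_stat = simulate_mean(theta, n_obs, rng)
        if abs(sim_stat - observed_stat) <= tolerance:
            accepted.append(theta)  # accepted draws approximate the posterior
    return accepted


rng = random.Random(1)  # fixed seed so the sketch is repeatable
observed = 2.0          # toy "observed" summary statistic, not real data
posterior = abc_rejection(observed, n_obs=20, n_sims=10000, tolerance=0.1, rng=rng)
print(len(posterior), statistics.fmean(posterior))
```

The caveats above show up directly in this sketch: the result depends on which summary statistic is compared, on the tolerance, and on how many simulations can be afforded.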

Estimating Insert Sizes

Posted on August 12, 2013 by Thuy

We recently had some trouble estimating insert sizes for our mate-pair libraries (also called jumping libraries, which have larger insert sizes). All the libraries sequenced by Biodiversity and the Genome Sciences Centre (GSC) were shockingly bad, but the libraries sequenced by INRA were very good. For example, according to the pipeline, the GSC 10 kbp library had an average insert size of 236 bp, whereas the INRA 20 kbp library had an average insert size of 20,630 bp.
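A single average can hide what is going on in a library like this, so it helps to look at more than one summary. The sketch below is not the pipeline mentioned above. It assumes you already have a list of per-pair insert sizes, and it reports the mean, the median, and the fraction of pairs near the nominal size. The function name, the window, and the toy values are all made up for illustration.

```python
import statistics


def summarize_inserts(insert_sizes, expected, window=0.5):
    """Return mean, median, and the fraction of pairs within +/- window * expected."""
    lo, hi = expected * (1 - window), expected * (1 + window)
    near_expected = sum(lo <= s <= hi for s in insert_sizes)
    return {
        "mean": statistics.fmean(insert_sizes),
        "median": statistics.median(insert_sizes),
        "fraction_near_expected": near_expected / len(insert_sizes),
    }


# Made-up toy values: mostly short pairs plus two pairs near 10 kbp.
toy = [230, 240, 236, 250, 220, 245, 9800, 10100, 228, 241]
summary = summarize_inserts(toy, expected=10_000)
print(summary)
```

For these toy values the mean is 2179, the median is 240.5, and only 20% of the pairs fall near the nominal 10 kbp, which is the kind of pattern a histogram would also reveal.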

See the histogram for the 10kbp library:


GBS, coverage and heterozygosity

Posted on June 27, 2013 by Greg Owens

I’m running some tests on my GBS data to look for population expansion. From GBS data for an F1 genetic mapping population, I know that heterozygotes can be under-called because of variation in amplification and digestion. In my own data, too, observed heterozygosity is almost always lower than expected. Conversely, heterozygotes can be overcalled when duplicated loci are aligned together. The tests I’m going to use rely explicitly on observed heterozygosity, so this is worrying.
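As a small worked example of the comparison being described, the sketch below computes observed heterozygosity (the fraction of heterozygous genotype calls) and expected heterozygosity (one minus the sum of squared allele frequencies, with no small-sample correction) at a single locus. The function name and the toy genotypes are assumptions for illustration and are not taken from the GBS data in this post.

```python
def heterozygosity(genotypes):
    """Observed and expected heterozygosity at one locus.

    genotypes: list of allele pairs such as ("A", "G"); remove missing calls first.
    """
    n = len(genotypes)
    observed = sum(a != b for a, b in genotypes) / n
    alleles = [allele for pair in genotypes for allele in pair]
    freqs = [alleles.count(x) / len(alleles) for x in set(alleles)]
    expected = 1.0 - sum(f * f for f in freqs)  # 2pq for a biallelic locus
    return observed, expected


# Made-up toy locus: 6 AA, 2 AG, 2 GG individuals.
toy = [("A", "A")] * 6 + [("A", "G")] * 2 + [("G", "G")] * 2
ho, he = heterozygosity(toy)
print(ho, he)
```

Here the allele frequencies are 0.7 and 0.3, so expected heterozygosity is about 0.42 while observed heterozygosity is 0.2. A deficit like this could reflect real population structure or inbreeding, but it could equally come from under-called heterozygotes.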


Global climate and soil data (Kathryn)

Posted on April 27, 2012 by Kathryn

There are a few publicly available data sets that are useful for looking at the abiotic environments of specific locations.


SNP summary statistics in R: ‘hierfstat’ is back and better than before! (Rose)

Posted on January 2, 2012 by Rose

After being disabled and not supported for several months, ‘hierfstat’ (by Jerome Goudet) now has lots of useful (and fast) calculations of summary statistics, including expected and observed heterozygosity, Fst and Jost’s Dest.


STACKS installation (Rose)

Posted on December 12, 2011 by Rose

Installing stacks on Ubuntu Natty Narwhal or Oneiric Ocelot

STACKS is a piece of software produced by Julian Catchen in the Cresko lab. It’s designed to identify loci and alleles from RAD (or GBS) reads either de novo or after alignment to a reference. It consists of several modules that can be run separately, but to completely install it as a pipeline, it relies on a web server, unfortunately. Many of the required instructions are given in the README file, but because nobody in our lab is an expert on this, we had to fiddle around to get the program running on our Ubuntu machines.


How to post – code (Dan E.)

Posted on November 16, 2011 by Dan E.

We have a problem sharing code via RLR.

The Problem

Unfortunately, WordPress only allows certain file types to be uploaded to our media library, and none of the useful coding file types are among them. The list is simply a list of acceptable file extensions. This means that if you write a useful R (or Perl or Python) script and save it with a standard file extension, like .R or .pl, WordPress will not allow you to upload it to the RLR media library so that you can share it via a post.

The Solution

The list of acceptable file extensions can be hacked and I might give it a try but, until I do, you will have to do one of these things:

Change the file extension. If you save your script as a .txt file it will upload fine. You should make it clear in your post what kind of script it is and then people who download it can change the .txt extension to whatever they want.
Put the code in your post. If your script is not too long you can simply copy and paste the code from your text editor into the post editor. The formatting of the code will remain true to the original, so users can simply copy and paste it back out into a text editor, RStudio, or wherever. See Rose’s post about plotting STRUCTURE results for an example of this.
Compress your script file. If your script is big you can try zipping it and then uploading the compressed file. Users can then just download and unzip it. [As of November 2011 this hasn’t been tested.]
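The first and third options can be scripted if you have several files to share. The sketch below makes a .txt copy of a script and a zipped copy using only the Python standard library. The function name and the demo file name are hypothetical, and whether WordPress accepts the zipped file is still untested, as noted above.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def make_uploadable(script_path, out_dir):
    """Create a .txt copy and a zipped copy of a script; return both paths."""
    script = Path(script_path)
    out = Path(out_dir)
    # Appending .txt keeps the original extension visible in the file name.
    txt_copy = out / (script.name + ".txt")
    shutil.copyfile(script, txt_copy)
    zip_path = out / (script.name + ".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(script, arcname=script.name)  # store under the original name
    return txt_copy, zip_path


# Demo on a throwaway file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "plot_structure.R"  # hypothetical script name
    demo.write_text("print('hello')\n")
    txt_copy, zip_path = make_uploadable(demo, tmp)
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    print(txt_copy.name, names)
```

Whoever downloads the .txt copy only needs to delete the trailing .txt to get the original file name back.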

Dan E.

R script for plotting STRUCTURE results (Q values) (Rose)

Posted on November 16, 2011 by Rose

This is an R Script that plots individual Q values and labels populations. It can be modified to take average group membership from CLUMPP output and/or to import different population names and higher level groupings from elsewhere.

N.B. I haven’t run this on very many data sets, so it will probably need to be tweaked for your results. But please leave a comment if you run into any problems.


Installing R packages and running scripts from the command line (Seb)

Posted on October 28, 2011 by Seb

Installing R packages from the command line (on the cluster and/or on your own computer)

Note that in this post, lines preceded by the dollar sign ($) are commands typed directly into your shell session. Lines preceded by the greater-than sign (>) are commands typed into an R session.

Rieseberg Lab Resources

RLR: Technical resources for Rieseberglers

Category Archives: Stats
