When analyzing genomic data, we first need to align the reads to the genome. There are a lot of aligners to choose from, including BWA (medium choice), stampy (very accurate) and bowtie2 (very fast). Recently a new aligner came out, NextGenMap. It claims to be both faster and better at dealing with divergent read data than other methods.
There are two main ways to barcode WGS libraries so that they can be run together on the same lane:
– In-line barcodes: unique sequences are located at the very end of one or both adapters. This sequence will be at the very beginning of each read from a given library. This is the barcode system that is normally used for GBS libraries as well.
– Indices: barcodes are in the middle of one or both adapters. These barcodes are read through an independent round of sequencing. For a paired-end library you would therefore have two rounds of sequencing of your fragment and a third round of sequencing for the index (and I guess a fourth one as well, if you have double indices). This is the system used in most commercial kits.
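To make the in-line case concrete, here is a minimal sketch of how reads could be sorted back into their libraries by the barcode at the start of each read. The barcode sequences, sample names and reads are all made up for illustration, and the function only does exact matching (no tolerance for sequencing errors in the barcode), so treat it as a sketch rather than a real demultiplexer.

```python
def demultiplex(reads, barcodes):
    """Assign (name, sequence) reads to samples by their leading in-line barcode.

    reads    -- iterable of (read_name, sequence) tuples
    barcodes -- dict mapping barcode sequence -> sample name (hypothetical values)
    Returns a dict: sample name -> list of (read_name, sequence without barcode).
    """
    # Try longer barcodes first, so a short barcode that happens to be a
    # prefix of a longer one cannot shadow it.
    ordered = sorted(barcodes.items(), key=lambda kv: len(kv[0]), reverse=True)
    bins = {sample: [] for sample in barcodes.values()}
    bins["unassigned"] = []
    for name, seq in reads:
        for barcode, sample in ordered:
            if seq.startswith(barcode):
                # Strip the barcode so only the genomic part of the read is kept.
                bins[sample].append((name, seq[len(barcode):]))
                break
        else:
            # No barcode matched exactly at the start of the read.
            bins["unassigned"].append((name, seq))
    return bins


# Made-up barcodes and reads, for illustration only.
BARCODES = {"ACGT": "sample_A", "TGCAA": "sample_B"}
READS = [
    ("r1", "ACGTGGGTTTAAA"),
    ("r2", "TGCAACCCGGG"),
    ("r3", "NNNNGGGG"),
]
BINS = demultiplex(READS, BARCODES)
```

Indexed libraries do not need this step on the read itself, since the index comes from its own sequencing round and the reads start directly with the insert.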
As you know, the sunflower genome contains a large amount of repetitive sequences, which is why it is so big and so annoying to sequence. I have been working for a while on optimizing a depletion protocol, to try to get rid of some repetitive sequences in NGS libraries (transposons, chloroplast DNA…).
As some of you might know, I have been working for the last few months on optimizing a protocol for Illumina WGS libraries that will reduce our dependency on expensive kits without sacrificing quality. The ultimate goal would be to be able to use WGS libraries as a more expensive but hopefully more informative alternative genotyping tool to GBS. Getting to that point ideally requires developing:
1) A cheaper alternative for library preparation (this post)
2) A reliable multiplexing system (this other post)
3) A way to shrink the sunflower genome before sequencing it (because, as you know, it’s rather huge) (yet another post)
The following protocol is for non-multiplexed libraries. The protocol for multiplexed ones is actually identical; you just need to change adapters and PCR primers – more about that in the multiplexing post.
If you are planning to pool libraries and deplete them of repetitive elements, read all three posts carefully before starting your libraries (mostly because you might need to use different adapters and PCR primers).
Chris and I devised and implemented this method to identify contaminant DNA in our HA412 454 and Illumina WGS reads.
N.B. I will update this post with links to the scripts once I get them up on the lab’s bitbucket account, plus extra details if I think they are necessary on review.
This post describes the steps I took to assemble plastid genomes from low-coverage WGSS data. An overview of the approach can be found here.
Essentially, the method involves first mapping quality-filtered reads to a reference plastid genome, then selecting only the plastome-like reads from this mapping step for subsequent de-novo assembly. For the assembly step, I used the VELVET assembler, which performs well for small genomes and is quite fast.
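The read-selection step can be sketched roughly as follows. This assumes the mapper writes its output in SAM format (the mapper and its output format are not specified here, so that part is an assumption), and it simply treats any read whose "unmapped" flag bit (0x4) is not set as plastome-like. The example alignment lines are invented.

```python
def mapped_read_names(sam_lines):
    """Return the names of reads that mapped to the reference.

    sam_lines -- iterable of text lines in SAM format (assumed input format)
    """
    names = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line, not an alignment record
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record, skip it
        flag = int(fields[1])
        # Bit 0x4 of the SAM FLAG field means "read unmapped".
        if not flag & 0x4:
            names.add(fields[0])
    return names


# Invented SAM records: read1 and read3 map to the plastome, read2 does not.
EXAMPLE_SAM = [
    "@SQ\tSN:plastome\tLN:150000",
    "read1\t0\tplastome\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTTTTTTT\tIIIIIIIIII",
    "read3\t16\tplastome\t500\t60\t10M\t*\t0\t0\tGGGGCCCCAA\tIIIIIIIIII",
]
PLASTOME_LIKE = mapped_read_names(EXAMPLE_SAM)
```

The resulting set of names would then be used to pull the matching reads out of the original read files before handing them to the assembler.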
I’ve been making whole genome shotgun sequencing libraries (for the purposes of this post: WGSS libraries) to sequence sunflower genomes on the Biodiversity Centre’s Illumina HiSeq. I haven’t been doing it for very long and it’s likely that my approach will change in the future as costs and products change but, as of early 2012, I’ve landed on a hybrid protocol based on kits from an outfit called Bioo Scientific. I use the Bioo Sci. adapter kit and their library prep kit up to the final PCR step, at which point I switch to a PCR kit from another outfit called KAPA. I also use a KAPA kit to quant libraries with qPCR. In this post I give a little context, then describe what I do to make WGSS libraries.
As of March 2012 we are using the Bioo Scientific NEXTflex barcoded adapters for WGS sequencing libraries made in the lab (well, by me so far). The set we are currently using comprises 48 barcodes, so we can multiplex up to a 48-plex in one lane on the Illumina HiSeq sequencer.
Below are the sequences of the Illumina adapters and the 48 barcodes we are currently using.
This post is about fragmenting, or shearing, genomic DNA to a particular size range using the Rieseberg lab’s Bioruptor sonicator.
Most of the current whole genome shotgun (WGS) library preparation protocols for NGS applications start with fragmented DNA. Generally speaking, this starting DNA should be a certain size and, for multiple samples, consistently that size. This objective turns out to be quite a tricky thing to accomplish with the Bioruptor. Given that WGS sequencing will probably continue to be popular in the lab, I am posting here what I have learned so far about taking whole genomic sunflower DNA and smashing it to the size range that I want using the Bioruptor. If I discover anything else in future library preps I’ll add it below. If anybody else has useful tips please comment.