SNP calling with ML (Greg B.)

Email for v7. Bug found that printed G’s as C’s and vise versa.

Call SNPs from sam files in a method similar to Hohenlohe et al 2010. Updated to v4 Feb 9. Previous version had a bug.

It is now fixed up for all of BWAs cigars flavors.

This only deals with reads that fit one of the following:
Full alignment (55M)
Soft clip at the start (10S45M)
Soft clip at end (45M10S)
One deletion (25M10D25M)
One insertion (20M10I20M)

This means it ignores reads with a cigar fields that have N, H, P, = or X and it ignores reads with a cigar more complicate then a single soft clip or a single indel. It also does not penalize reads adjacent to indels.

It ignores bases in soft clipped parts of reads


Hopefully Rose and/or Chris can comment on the specifics of the algorithm as we have implemented here.

#usage
perl bin/rad_lrt_or_ml_v2.pl sam_files.list snp_calls/project1

sam_files.list is a list of the samfiles you want parsed. Simply:

/home/greg/project/sam_files/sample1.sam
/home/greg/project/sam_files/sample2.sam
/home/greg/project/sam_files/sample3.sam
...

The last argument is where you want them printed, here I have it go to a new folder (which has to exist) and append project1 onto the name. The out put will look something like this:

/home/greg/project/snp_calls_project1_calls_sample1.sam
/home/greg/project/snp_calls_project1_counts_sample1.sam
/home/greg/project/snp_calls_project1_calls_sample2.sam
...

For your parsing pleasure you can have it print out the read depth count for each nt for each location just set line 14 to “true” (or anything).

#line 14 print out counts
my $print_counts = "T";
#if you do not want the counts
my $print_counts;

Next merge the SNP calls
Back to population genomics

Hohenlohe PA, Bassham S, Etter PD, Stiffler N, Johnson EA, et al. 2010 Population Genomics of Parallel Adaptation in Threespine Stickleback using
Sequenced RAD Tags. PLoS Genet 6(2): e1000862. doi:10.1371/journal.pgen.1000862

One thought on “SNP calling with ML (Greg B.)

Comments are closed.