Recently, BWA (an alignment program) suddenly started giving a strange error message saying that a reference file ending in *.nt.ann was missing. This file type was unfamiliar to me, with good reason: it’s a colourspace reference file, which shouldn’t be generated when we index the FASTA-based references we’re using (at least, I don’t know of anyone in our lab working with SOLiD data). DO NOT rebuild the reference with the -c (colourspace) flag, as you might see suggested on the web, because we don’t know what effect that might have on our alignments. DO rebuild it with the usual settings.
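As a rough sketch of that rebuild, the helper below lists which of the ordinary (non-colourspace) index files are missing for a reference. The function name missing_bwa_index and the file name reference.fasta are made up for illustration. The list of extensions (.amb, .ann, .bwt, .pac, .sa) is an assumption based on what recent BWA versions write, and older versions may produce extra files.

```shell
# List the ordinary (base-space) BWA index files that are missing for a reference.
# Assumption: the index is made up of the .amb, .ann, .bwt, .pac and .sa files,
# as written by recent BWA versions; older versions may write additional files.
missing_bwa_index() {
    ref="$1"
    for ext in amb ann bwt pac sa; do
        # Print the path of every expected index file that does not exist.
        [ -e "$ref.$ext" ] || echo "$ref.$ext"
    done
}

# Hypothetical usage: rebuild with the default settings (note: no -c flag)
# only if at least one index file is missing.
# if [ -n "$(missing_bwa_index reference.fasta)" ]; then
#     bwa index reference.fasta
# fi
```

If the function prints nothing, the usual index files are all in place and the error probably has another cause.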
This is a post about some time-saving help Chris Grassa gave me.
STACKS (post coming soon) doesn’t deal well with all of the unaligned reads in SAM files, so I tried using PICARD to remove them. PICARD doesn’t like the SAM output of BWA, though, so Chris G showed me how to do it much more easily with the Unix command awk. This is his command for my file 1076.sam: