Phred score detector

When using old sequence data you have to know what phred score system it is using. Almost all new data is going to use Phred33, but the old stuff could also be Phred64. There are scripts on this website that have ways of automatically detecting it http://wiki.bits.vib.be/index.php/Identify_the_Phred_scale_of_quality_scores_used_in_fastQ

I took the perl script from this website and parred down the output so that it works in pipelines. It outputs either “phred33” or “phred64”

fastq_detect_minimal.pl

An example pipeline in bash is:

coding=”$(perl $bin/fastq_detect_minimal.pl ${name}_1.fq)”

java -jar trimmmomatic-0.32.jar -$coding sample1.fq ….etc

NOTE: Phred64 should only go up to 104. There are some RNA seq samples (and probably others) that go up to 105. The original script output an error, while my version just says phred64. I hope that running it through phred conversion in trimmomatic fixes the issue but I am not sure.