Hello,
I have RNA-seq data aligned to my reference genome in BAM format. I would like to correlate the base quality scores of mutated nucleotides with the type of substitution, i.e. each transition or transversion (e.g. A->C, A->G, A->T, ..., T->G). The aim is to show that certain types of mutations are sequencing artifacts (i.e. have a lower quality score) while others are biologically relevant (i.e. have a high quality score). The data should be summarized in a box-and-whisker plot showing the median, 1st quartile, 3rd quartile, minimum, and maximum for each type of mutation.

Is there a script available that can extract the quality scores of the mutated bases from a BAM file and sort them by type of mutation? Basically, anything that takes my BAM files as input and writes a file of numbers that I can load into Excel would be very helpful. Unfortunately, I don't know how to write scripts myself, as I am a virologist by training.
Thanks for your ideas and help!
Christian