Question: Count Reads Mapping To Introns, Extragenic Regions
    
Alex Koeppel • 10 wrote:
Hello everyone,
I have some SAM/BAM files containing alignments of RNA-seq reads to hg19. I'd like to calculate some mapping statistics, specifically the percentage of reads mapping to exons, introns, and extragenic regions.
I gather that this can be done with bedtools, but I'm stuck on figuring out which files I need to get this information. As I understand it, I need a GTF (or possibly GFF) file, and I downloaded one from the UCSC browser using the settings in the attached image.
The first few lines of the resulting file are pasted below. I see that the file has exon start and end sites. Is there a way to get what I need with this file, or do I need something else?
Any assistance would be much appreciated.
Thanks,
Alex
cat gencode.gtf | head -3
#bin	name	chrom	strand	txStart	txEnd	cdsStart	cdsEnd	exonCount	exonStarts	exonEnds	score	name2	cdsStartStat	cdsEndStat	exonFrames
0	ENST00000237247.6	chr1	+	66999065	67210057	67000041	67208778	27	66999065,66999928,67091529,67098752,67099762,67105459,67108492,67109226,67126195,67133212,67136677,67137626,67138963,67142686,67145360,67147551,67149789,67154830,67155872,67161116,67184976,67194946,67199430,67205017,67206340,67206954,67208755,	66999090,67000051,67091593,67098777,67099846,67105516,67108547,67109402,67126207,67133224,67136702,67137678,67139049,67142779,67145435,67148052,67149870,67154958,67155999,67161176,67185088,67195102,67199563,67205220,67206405,67207119,67210057,	0	SGIP1	cmpl	cmpl	-1,0,1,2,0,0,0,1,0,0,0,1,2,1,1,1,1,1,0,1,1,2,2,0,2,1,1,
0	ENST00000371039.1	chr1	+	66999274	67210768	67000041	67208778	22	66999274,66999928,67091529,67098752,67105459,67108492,67109226,67136677,67137626,67138963,67142686,67145360,67154830,67155872,67160121,67184976,67194946,67199430,67205017,67206340,67206954,67208755,	66999355,67000051,67091593,67098777,67105516,67108547,67109402,67136702,67137678,67139049,67142779,67145435,67154958,67155999,67160187,67185088,67195102,67199563,67205220,67206405,67207119,67210768,	0	SGIP1	cmpl	cmpl	-1,0,1,2,0,0,1,0,1,2,1,1,1,0,1,1,2,2,0,2,1,1,
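The pasted lines look like a UCSC table dump (one transcript per row, with comma-separated exonStarts and exonEnds columns) rather than GTF. The exon columns are still enough to derive exon and intron intervals: each intron is the gap between one exon's end and the next exon's start. Below is a minimal Python sketch, assuming the 16-column layout shown in the header above and 0-based, half-open coordinates as in BED. The names `parse_row`, `to_bed`, and `TOY_ROW` are hypothetical, and the toy row is made-up data, not taken from the file.

```python
from typing import List, Tuple

Interval = Tuple[int, int]

# Made-up example row in the same column layout as the header above.
TOY_ROW = (
    "0\ttx1\tchr1\t+\t100\t500\t120\t480\t3\t"
    "100,250,400,\t150,300,500,\t0\tGENE1\tcmpl\tcmpl\t0,1,2,"
)


def parse_row(line: str):
    """Return (name, chrom, strand, exons, introns) for one table row."""
    fields = line.split()
    name, chrom, strand = fields[1], fields[2], fields[3]
    # exonStarts / exonEnds end with a trailing comma, so drop empty pieces.
    starts = [int(x) for x in fields[9].split(",") if x]
    ends = [int(x) for x in fields[10].split(",") if x]
    exons: List[Interval] = list(zip(starts, ends))
    # An intron spans from the end of one exon to the start of the next.
    introns: List[Interval] = [
        (exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)
    ]
    return name, chrom, strand, exons, introns


def to_bed(chrom: str, intervals: List[Interval], name: str, strand: str) -> List[str]:
    """Format intervals as 6-column BED lines (score fixed at 0)."""
    return [f"{chrom}\t{s}\t{e}\t{name}\t0\t{strand}" for s, e in intervals]


if __name__ == "__main__":
    name, chrom, strand, exons, introns = parse_row(TOY_ROW)
    print("\n".join(to_bed(chrom, exons, name, strand)))
    print("\n".join(to_bed(chrom, introns, name, strand)))
```

Because transcripts of the same gene overlap (as the two SGIP1 rows do), the per-transcript intervals would likely need to be merged before counting reads, with positions that are exonic in any transcript removed from the intron set. Reads overlapping no transcript at all could then be treated as extragenic.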
            
modified 5.7 years ago by Jennifer Hillman Jackson ♦ 25k • written 5.8 years ago by Alex Koeppel • 10
    
            