Question: HELP! My HTSeq count table is all zeros
18 months ago, j.gonzalez10 wrote:

Hello!!

I am running HTSeq on my SAM files (produced by Bowtie) with the GTF file I downloaded from Ensembl for Phytophthora infestans. Everything runs fine, except that every count in my HTSeq output is zero. I have read in other posts that this may happen because my GTF file uses different identifiers than my SAM output, but I really don't know how to fix it, and it is somewhat urgent! I am analyzing small RNA-seq data from four different samples, so I need the count table in order to see whether small RNAs are differentially expressed.

Thanks!!!!

I will remain attentive!!!

Best, Juliana

modified 18 months ago by Jennifer Hillman Jackson • written 18 months ago by j.gonzalez10

Post the first 5-10 lines of the SAM file and the first 5-10 lines of the GTF file.

written 18 months ago by Devon Ryan

Thanks!

First seven lines of the SAM file:

@HD VN:1.0 SO:unsorted
@SQ SN:NW_003302556.1 LN:4850
@SQ SN:NW_003302557.1 LN:5561
@SQ SN:NW_003302558.1 LN:4798
@SQ SN:NW_003302559.1 LN:40370
@SQ SN:NW_003302560.1 LN:4805
@SQ SN:NW_003302561.1 LN:5155

First seven lines of the GTF file:

supercont1.1 broads exon 10097 10114 . - . transcript_id "transcript:PITG_00002T0"; gene_id "gene:PITG_00002";
supercont1.1 broads exon 10171 10433 . - . transcript_id "transcript:PITG_00002T0"; gene_id "gene:PITG_00002";
supercont1.1 broads exon 10474 10522 . - . transcript_id "transcript:PITG_00002T0"; gene_id "gene:PITG_00002";
supercont1.1 broads CDS 10100 10114 . - 0 transcript_id "transcript:PITG_00002T0"; gene_id "gene:PITG_00002";
supercont1.1 broads CDS 10171 10433 . - 2 transcript_id "transcript:PITG_00002T0"; gene_id "gene:PITG_00002";
supercont1.1 broads CDS 10474 10522 . - 0 transcript_id "transcript:PITG_00002T0"; gene_id "gene:PITG_00002";
supercont1.1 broads exon 38775 39071 . + . transcript_id "transcript:PITG_00003T0"; gene_id "gene:PITG_00003";

modified 18 months ago • written 18 months ago by j.gonzalez10

Devon's hunch was right, I suppose. Your BAM/SAM file was aligned to sequences named 'NW_003302556.1' etc., while your GTF file contains sequences named 'supercont1.1'. Because these names do not correspond, HTSeq finds no reads aligned to 'supercont1.1' and reports zero counts.
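
A quick way to confirm this kind of mismatch before counting is to compare the reference names in the SAM header with the sequence names in the first column of the GTF. The following is only a minimal sketch, not part of HTSeq: the function names are made up for illustration, and it assumes both files are tab-delimited, as the SAM and GTF formats require (the snippets posted above show the tabs as spaces).

```python
def sam_reference_names(sam_lines):
    """Collect the SN: values from the @SQ lines of a SAM header."""
    names = set()
    for line in sam_lines:
        if not line.startswith("@"):
            break  # header lines come first; stop at the first alignment
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def gtf_sequence_names(gtf_lines):
    """Collect the values of column 1 (seqname) from GTF records."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def shared_names(sam_lines, gtf_lines):
    """Names present in both files; an empty set means nothing can be counted."""
    return sam_reference_names(sam_lines) & gtf_sequence_names(gtf_lines)


# Two header lines and one record taken from the snippets in this thread.
sam = [
    "@HD\tVN:1.0\tSO:unsorted\n",
    "@SQ\tSN:NW_003302556.1\tLN:4850\n",
]
gtf = [
    'supercont1.1\tbroads\texon\t10097\t10114\t.\t-\t.\t'
    'transcript_id "transcript:PITG_00002T0"; gene_id "gene:PITG_00002";\n',
]
print(shared_names(sam, gtf))  # prints set()
```

For real data, open file handles can be passed instead of the lists, since both functions only iterate over lines.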

written 18 months ago by y.hoogstrate

Thank you very much! That was indeed the problem.

written 18 months ago by j.gonzalez10
18 months ago, Devon Ryan (Germany) wrote:

The easiest solution to this is to realign your reads to the FASTA file associated with your GTF file. Then htseq-count will be happy. While you could use samtools reheader (it's available in Galaxy) to fix this by manually converting the chromosome names, it's so easy to make a ruinous mistake there that I wouldn't advise it.
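
To illustrate why manual renaming is risky, here is a hedged sketch of the checks a renamed header would need to pass before being handed to samtools reheader. It is an illustration only, not a recommended workflow: `rename_header`, `mapping`, and `target_lengths` are hypothetical names, the `old_1`/`new_1` contig names are placeholders, and the real correspondence between the NW_ accessions and the supercont names is not given in this thread.

```python
def rename_header(header_lines, mapping, target_lengths):
    """Return SAM header lines with @SQ SN: names translated via `mapping`.

    Raises ValueError when a name has no mapping, when two old names map to
    the same new name, or when the LN: length differs from the length of the
    target sequence in `target_lengths`.
    """
    seen = set()
    renamed = []
    for line in header_lines:
        line = line.rstrip("\n")
        if not line.startswith("@SQ"):
            renamed.append(line)  # non-@SQ header lines pass through unchanged
            continue
        fields = line.split("\t")
        tags = dict(field.split(":", 1) for field in fields[1:])
        old = tags["SN"]
        if old not in mapping:
            raise ValueError(f"no mapping for {old}")
        new = mapping[old]
        if new in seen:
            raise ValueError(f"{new} is the target of more than one name")
        seen.add(new)
        if int(tags["LN"]) != target_lengths.get(new):
            raise ValueError(f"length mismatch for {old} -> {new}")
        renamed.append("\t".join(
            f"SN:{new}" if field.startswith("SN:") else field
            for field in fields))
    return renamed


# Placeholder names; a real mapping would have to come from the assembly records.
header = ["@HD\tVN:1.0\tSO:unsorted", "@SQ\tSN:old_1\tLN:100"]
new_header = rename_header(header, {"old_1": "new_1"}, {"new_1": 100})
```

Even with these checks, a mapping that pairs the wrong contigs of equal length would go unnoticed, which is one reason realigning remains the safer route.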


Thank you Devon!!! That was the most adequate solution!!!

written 18 months ago by j.gonzalez10