How can I run HTSeq with Galaxy, either on the main website or locally on our own server?
If we cannot run HTSeq, is it possible to run Cufflinks without separating the different isoforms?
Hello,
HTSeq is available as a wrapped tool from the Tool Shed, for use in a local or cloud Galaxy.
If you wish to use Cufflinks instead and consider only the primary isoform of each gene, use a reference annotation dataset (GTF or GFF3) that contains only the transcripts you wish to consider. Certain sources, such as UCSC's "UCSC Genes" track, identify a primary transcript per gene bound. In the simplest interpretation, this is the "most complete" (5' and 3' complete) and "longest" transcript, but there are some tie-breaking rules. Note that this particular reference GTF will not contain all of the attributes that Cuffdiff can make use of (if you go that far with this pipeline). If an iGenomes dataset exists for your target genome, you may wish to instead filter the iGenomes version of the reference annotation using similar logic (or however you wish to pick the transcript that represents the gene bound).
Then set the option "Count hits compatible with reference RNAs only" to "Yes".
It is also possible to supply a "mask" file (transcripts to ignore) under the advanced settings, listing any isoforms that you wish to leave out of the analysis, but the results will not be quite the same: novel splices in your samples will still appear in the results.
Then later when using Cuffdiff, use that same reference annotation file to restrict what is reported (Cuffmerge would not be needed).
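For the annotation filtering step described above, here is a minimal Python sketch of one way to reduce a GTF to a single transcript per gene before uploading it to Galaxy. It is only an illustration, not what UCSC or iGenomes do internally: it assumes the GTF has `gene_id` and `transcript_id` attributes that correctly group isoforms by gene, it uses "largest summed exon length" as a stand-in for the primary transcript, and it breaks ties by transcript ID. The function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches GTF attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Return the GTF attribute column (column 9) as a dict."""
    return dict(ATTR_RE.findall(field))


def longest_transcript_per_gene(gtf_lines):
    """Pick one transcript ID per gene: the one with the largest summed exon length.

    Assumption: "longest" approximates the primary isoform. Ties are broken
    by choosing the lexicographically smallest transcript ID.
    """
    exon_len = defaultdict(int)  # transcript_id -> summed exon length
    gene_of = {}                 # transcript_id -> gene_id
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        attrs = parse_attributes(cols[8])
        tid, gid = attrs.get("transcript_id"), attrs.get("gene_id")
        if not tid or not gid:
            continue
        # GTF coordinates are 1-based and inclusive.
        exon_len[tid] += int(cols[4]) - int(cols[3]) + 1
        gene_of[tid] = gid

    best = {}  # gene_id -> (-length, transcript_id); smallest tuple wins
    for tid, length in exon_len.items():
        candidate = (-length, tid)
        gid = gene_of[tid]
        if gid not in best or candidate < best[gid]:
            best[gid] = candidate
    return {tid for _, tid in best.values()}


def filter_gtf(gtf_lines, keep):
    """Yield comment lines plus every feature line whose transcript_id is in `keep`."""
    for line in gtf_lines:
        if line.startswith("#"):
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 9 and parse_attributes(cols[8]).get("transcript_id") in keep:
            yield line
```

To use it, read the GTF into a list of lines, call `longest_transcript_per_gene` on that list, then write the output of `filter_gtf` to a new file. That filtered file would be the reference annotation for both Cufflinks and Cuffdiff.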
Hopefully this helps! Jen, Galaxy team
Thanks a lot. I will try to install it on our server.
Yufang
From: Jennifer Hillman Jackson on Galaxy Biostar [mailto:notifications@biostars.org] Sent: Wednesday, July 15, 2015 12:47 PM To: Yufang Jin Subject: [galaxy-biostar] A: how to run HTseq with galaxy