Hello,
We can help. Here are the guidelines for common usage of the Tuxedo RNA-seq pipeline with a reference genome that is not native to the Galaxy instance you happen to be working on:
1. The Tuxedo pipeline accepts "reference annotation" in GTF or GFF3 format. Plain GFF is not supported (it will not contain the transcript/gene identifiers necessary to be useful). So you will be using this file: Malus_x_domestica.v1.0-primary.transcripts.gff3
2. The "reference genome" must be the same consensus genomic backbone that the GFF3 is based on (the chromosome identifiers and coordinates that the transcript/gene bounds are mapped to). This would be "Malus_x_domestica.v1.0"? Load the .fasta version of that genome into Galaxy as a "Custom Reference Genome".
3. Your data in "Malus_x_domestica.v1.0.consensus2contigs.gff" and "apple_genome_contigs.nuc" map back the details of how the "reference annotation" was created from the source genomic contigs. These are useful if there is a transcript assembly, gene assembly, or splice variant question or discrepancy in a region and you wish to investigate whether it is real or an artifact/sequencing issue.
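Before uploading, it can be worth confirming that the chromosome identifiers in the GFF3 really do match the sequence names in the fasta file, since a mismatch there is a common reason the annotation is not used. Below is a minimal Python sketch of such a check, run on your own computer and not inside Galaxy. The function names are made up for this illustration, the demo data is invented, and it assumes the fasta identifier is the text after ">" up to the first whitespace.

```python
import io


def fasta_ids(handle):
    """Collect sequence identifiers: text after '>' up to the first whitespace."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def gff3_seqids(handle):
    """Collect the column 1 identifiers from the feature lines of a GFF3 file."""
    ids = set()
    for line in handle:
        if line.startswith("##FASTA"):
            break  # optional embedded sequence section; no features follow it
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        ids.add(line.split("\t", 1)[0])
    return ids


def missing_from_genome(fasta_handle, gff3_handle):
    """Return annotation identifiers that have no matching fasta sequence."""
    return sorted(gff3_seqids(gff3_handle) - fasta_ids(fasta_handle))


# Tiny invented example: chr3 is annotated but absent from the genome.
demo_fasta = io.StringIO(">chr1 some description\nACGT\n>chr2\nGGCC\n")
demo_gff3 = io.StringIO(
    "##gff-version 3\n"
    "chr1\tsrc\tgene\t1\t4\t.\t+\t.\tID=g1\n"
    "chr3\tsrc\tgene\t1\t4\t.\t-\t.\tID=g2\n"
)
print(missing_from_genome(demo_fasta, demo_gff3))  # prints ['chr3']
```

For real data, pass open file handles for your fasta and gff3 files in place of the demo strings. An empty list means every identifier used in the annotation has a sequence with the same name in the genome.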
Key links to help and many more details (including tutorials, etc):
https://wiki.galaxyproject.org/Support#Tools_on_the_Main_server:_RNA-seq
https://wiki.galaxyproject.org/Support#Custom_reference_genome
https://wiki.galaxyproject.org/Support#Reference_genomes
I have made some assumptions from the given information, so please add clarification where I have misunderstood the context of the available inputs and we can work from there to further customize a solution.
Take care! Jen, Galaxy team