Hi everybody,
I'm totally new to RNA-seq analysis, and this is my first time using the Tuxedo protocol in Galaxy. Could somebody please help me with a few questions?
My experiment includes the following strains (3 biological replicates of each):
i. Strain A wildtype
ii. Strain A with a gene deleted
iii. Strain A with genes introduced
I would like to see how genome-wide gene expression changes in (ii) and (iii) compared to (i).
1) Cuffcompare
Am I supposed to run Cuffcompare on all 9 samples (3 biological replicates of each of i, ii and iii)? Or just on the wildtype strain (i), since I am comparing (ii) and (iii) to the wildtype?
2) Cuffcompare vs Cuffmerge
Should I use Cuffmerge or Cuffcompare? I've read the description "Cuffcompare Or Cuffmerge" but still have no idea which to use, as I have no background in bioinformatics. I've tried to use Cuffmerge in Galaxy, but it doesn't recognize the Cufflinks output files, and I'm not sure why.
3) Cuffdiff
For the "Transcript" input at the top, should I use the GFF3 file I downloaded from Ensembl Bacteria or the Cuffcompare output file?
The output file from Cuffdiff seems to give me XLOC numbers. How can I get the corresponding gene names?
I've tried to look up the genes using the 'location' from the Cuffdiff Gene Differential Expression Testing file, but the locations don't match the GFF3 annotation file that I supplied to Cuffcompare.
By the way, is it possible to detect genomic DNA contamination from the RNA-seq data?
I apologize for the number of questions, but could anyone please advise?
Many thanks.
Howard
