Hi, I have downloaded RNA-seq data as FASTQ files, aligned the reads against FASTA reference sequences, and then generated GTF files. However, the transcript coordinates reported by Cufflinks do not match the lengths of the sequences output by the Extract Genomic DNA tool. These sequences, which represent repeated stretches, seem to be nearly perfect duplications of the sequences predicted by Cufflinks. My question is whether one can trust the Extract Genomic DNA tool to correctly output assembled sequences when it comes to repeated sequences.
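
To make the length comparison concrete, here is a minimal sketch of how the expected lengths could be computed from the GTF, assuming a standard tab-separated GTF with `transcript` and `exon` records carrying a `transcript_id` attribute. The function name `transcript_lengths` and the example records are made up for illustration and are not from my data. GTF coordinates are 1-based and inclusive, so a feature's length is end - start + 1, and the summed exon length of a multi-exon transcript can be shorter than its genomic span; either number may be the relevant one depending on which records are passed to the extraction step.

```python
import re
from collections import defaultdict


def transcript_lengths(gtf_lines):
    """Return {transcript_id: (span_length, summed_exon_length)}.

    span_length is None if no 'transcript' record was seen for that ID.
    """
    spans = {}
    exon_sums = defaultdict(int)
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        feature = fields[2]
        start, end = int(fields[3]), int(fields[4])
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        tid = match.group(1)
        # GTF coordinates are 1-based and inclusive at both ends.
        length = end - start + 1
        if feature == "transcript":
            spans[tid] = length
        elif feature == "exon":
            exon_sums[tid] += length
    all_ids = set(spans) | set(exon_sums)
    return {tid: (spans.get(tid), exon_sums.get(tid, 0)) for tid in all_ids}


# Hypothetical records: one transcript with two 100 bp exons and a 200 bp intron.
example = [
    'chr1\tCufflinks\ttranscript\t101\t500\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tCufflinks\texon\t101\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; exon_number "1";',
    'chr1\tCufflinks\texon\t401\t500\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; exon_number "2";',
]
lengths = transcript_lengths(example)
print(lengths)
```

In this made-up example the genomic span is 400 bp while the exons sum to 200 bp, which is the kind of difference I would want to rule out before comparing against the extracted sequences.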
Vidar