Hi, I'm trying to convert from BED to BAM using Galaxy. The tool runs fine, but I get an error that makes the output file unusable. I am using sorted BED files.
First 6 lines of my bed file:
Chrom Start End Name Score Strand
chr1 12634845 12634885 HWI-ST1365:57:C1V5AACXX:1:2105:13323:121743 255 -
chr1 12634846 12634886 HWI-ST1365:57:C1V5AACXX:1:2102:3391:61106 255 +
chr1 12634846 12634886 HWI-ST1365:57:C1V5AACXX:1:2302:7595:70795 255 -
chr1 12634847 12634887 HWI-ST1365:57:C1V5AACXX:1:2301:5626:107512 255 -
chr1 12634850 12634890 HWI-ST1365:57:C1V5AACXX:1:1303:5150:55453 255 -
chr1 12634852 12634892 HWI-ST1365:57:C1V5AACXX:1:1306:7619:104639 255 -
This is the error:
[bam_index_core] the alignment is not sorted (HWI-ST1365:57:C1V5AACXX:2:2301:15601:133048): 55116-th chr > 3056-th chr
[bam_index_build2] fail to index the BAM file.
I would appreciate any suggestions on how to fix this as I need BAM file format to continue.
It seems like something is wrong with your BED file. The error suggests that there are more than 55,000 chromosomes, which is incredibly unlikely. My guess is that the name and chrom columns of your BED file are swapped in Galaxy.
As an aside, your BED file was made from a BAM file, so why not just use the original file?
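One quick way to test the swapped-column guess is to count the distinct values in the chrom and name columns. This is only a rough sketch: the function names `summarize_bed_columns` and `columns_look_swapped` are made up for illustration, and it assumes a whitespace-delimited BED with chrom in column 1 and name in column 4. A real chrom column should hold a handful of distinct values, while a read-name column holds roughly one per line.

```python
from collections import Counter


def summarize_bed_columns(lines, chrom_col=0, name_col=3):
    """Count distinct values in the chrom and name columns of BED lines.

    Hypothetical helper: column positions are assumptions (0-based),
    matching the six-column layout shown in the question.
    """
    chroms, names = Counter(), Counter()
    for line in lines:
        fields = line.split()
        if len(fields) <= max(chrom_col, name_col):
            continue  # blank or too-short line
        try:
            int(fields[1])  # start must be an integer; skips header lines
        except ValueError:
            continue
        chroms[fields[chrom_col]] += 1
        names[fields[name_col]] += 1
    return chroms, names


def columns_look_swapped(chroms, names):
    """Heuristic only: more distinct 'chromosomes' than names is suspicious."""
    return len(chroms) > len(names)


if __name__ == "__main__":
    sample = [
        "Chrom Start End Name Score Strand",
        "chr1 12634845 12634885 HWI-ST1365:57:C1V5AACXX:1:2105:13323:121743 255 -",
        "chr1 12634846 12634886 HWI-ST1365:57:C1V5AACXX:1:2102:3391:61106 255 +",
    ]
    chroms, names = summarize_bed_columns(sample)
    print(len(chroms), "distinct chrom values,", len(names), "distinct names")
    print("columns look swapped:", columns_look_swapped(chroms, names))
```

On the lines posted in the question, the chrom column contains only `chr1`, so this check would not flag a swap in the file itself. That leaves the column assignment in Galaxy's dataset metadata as the thing to inspect.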
Agreed, checking the metadata for datasets is a good idea when errors come up (I checked, and your data is correct).
The genome input file will still need to be correct, even if you use the original BAM input instead, as explained in my other reply below.
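On the genome file point, here is a rough way to check it outside Galaxy. It is a sketch under assumptions, not the tool's own logic: it assumes the genome file is a two-column chromosome-sizes listing (name, length), that the BAM header follows the order of that file, and that the index step expects reads in header order. The function name `chrom_order_problems` is hypothetical.

```python
def chrom_order_problems(bed_lines, genome_lines):
    """Report BED chromosomes that are missing from, or out of order
    relative to, a chromosome-sizes style genome file.

    Assumption: genome_lines look like "chr1<TAB>249250621", one per line.
    """
    # Rank each chromosome by its first appearance in the genome file.
    genome_rank = {}
    for line in genome_lines:
        fields = line.split()
        if fields:
            genome_rank.setdefault(fields[0], len(genome_rank))

    problems = []
    highest_rank = -1
    last_chrom = None
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 3 or not fields[1].isdigit():
            continue  # skip headers and malformed lines
        chrom = fields[0]
        if chrom == last_chrom:
            continue  # only look at transitions between chromosomes
        last_chrom = chrom
        if chrom not in genome_rank:
            problems.append(f"{chrom}: not in genome file")
        elif genome_rank[chrom] < highest_rank:
            problems.append(f"{chrom}: out of order relative to genome file")
        else:
            highest_rank = genome_rank[chrom]
    return problems


if __name__ == "__main__":
    genome = ["chr1\t100", "chr2\t100", "chr10\t100"]
    # A BED sorted alphabetically puts chr10 before chr2.
    bed = ["chr1 0 10 a 255 +", "chr10 0 10 b 255 +", "chr2 0 10 c 255 -"]
    for problem in chrom_order_problems(bed, genome):
        print(problem)
```

If this reports anything, a BED file that is sorted by name can still produce a BAM that the indexer considers unsorted, because the two orderings disagree.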