I have created a workflow with 9 steps. When I run the workflow, the first 2 steps are to input datasets. I have matching paired-end data, so ideally I would like to select all of the "read1".fastq files in step 1 and all of the "read2".fastq files in step 2. There is a button that looks like a magnet or linked chain next to the dataset. When I hover over it, it reads, "This input is linked and will be run in matched order with other input datasets (ex: use this for matching forward and reverse reads)." I thought, hooray, my issue is solved. However, the functionality of the button is not intuitive, and when I try to use it, it runs every sample in step 1 with every sample in step 2, i.e. sample1_read1.fastq with sample1_read2.fastq and also sample1_read1.fastq with sample3_read2.fastq. Has anyone figured this out, or can anyone provide a workaround?
Hi, I do successfully use this option to run multiple parallel workflows on pairs of fasta files. I assume that you have two 'input dataset' steps, one for the read1s and one for the read2s.
1. First make sure that your datasets are all in a consistent order in the same history (i.e. all read1s above read2s).
2. Run workflow.
3. Select the icon which looks like a pile of papers (ignore the icon that looks like a linked chain).
4. Search for all read 1s (using the search box at the bottom) in the read 1 input dataset, THEN SELECT ALL THE FILES using shift-select. They will go green then grey.
5. Repeat steps 3 and 4 for read 2.
6. Run workflow (sending each parallel workflow to a separate history, via the check box at the bottom, can make organisation easier).
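To illustrate the difference between pairing in matched order and the every-sample-with-every-sample behaviour described in the question, here is a small Python sketch. It runs outside Galaxy and does not use any Galaxy API; the file names are hypothetical and simply follow the sampleN_readN.fastq pattern from the question.

```python
from itertools import product

# Hypothetical file names, in the order they would be selected.
read1 = ["sample1_read1.fastq", "sample2_read1.fastq", "sample3_read1.fastq"]
read2 = ["sample1_read2.fastq", "sample2_read2.fastq", "sample3_read2.fastq"]

# Matched order: the i-th read1 file is paired with the i-th read2 file.
matched = list(zip(read1, read2))

# All-against-all: every read1 file is combined with every read2 file.
all_vs_all = list(product(read1, read2))

print(len(matched))     # number of workflow runs in matched order
print(len(all_vs_all))  # number of workflow runs when everything is crossed
```

With three samples, matched order gives three runs, while crossing everything gives nine, including unwanted pairs such as sample1_read1.fastq with sample3_read2.fastq.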
Thank you for taking the time to respond. I was following a very similar method to what you described. The only difference was that in my history the "read1" files were under the "read2" files in the list. I created a new history and imported the files so that they conformed to the order you described, and the tool worked as expected this time.
That is great, glad it worked.
To be honest, point 1 is wrong. You do need to be careful about the list order (or you will end up with the wrong files being paired), but it is not important that read 1s are above read 2s, as I incorrectly stated above. The only important thing is that read 1 and read 2 are in the same relative order. Just check that the two selected lists are in the same order before you hit start workflow.
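If the lists are long, the order check can also be done outside Galaxy before starting the run. The following is only a minimal Python sketch, assuming file names of the form sampleN_read1.fastq and sampleN_read2.fastq as in the question; the function names sample_key and mismatched_pairs are made up for this example and are not part of Galaxy.

```python
import re


def sample_key(filename):
    """Return the sample name, assuming names like 'sample1_read1.fastq'."""
    match = re.fullmatch(r"(.+)_read[12]\.fastq", filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    return match.group(1)


def mismatched_pairs(read1_files, read2_files):
    """List the positions where the two selections refer to different samples."""
    if len(read1_files) != len(read2_files):
        raise ValueError("the two selections have different lengths")
    return [
        (i, r1, r2)
        for i, (r1, r2) in enumerate(zip(read1_files, read2_files))
        if sample_key(r1) != sample_key(r2)
    ]


# Example: the second selection is in a different relative order.
problems = mismatched_pairs(
    ["sample1_read1.fastq", "sample2_read1.fastq"],
    ["sample2_read2.fastq", "sample1_read2.fastq"],
)
print(problems)
```

An empty list means the two selections line up sample by sample. Anything else lists the positions where the wrong files would be paired.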
If you find that you have imported files in the wrong order, you can use 'users>saved datasets' to sort files by name and then import them to a new history in an order that should work.
Feels kind of weird giving advice, as I am quite new to Galaxy and bioinformatics in general.