Question: Long wait times
luka.kremic • 20 wrote:
Hi,
I understand there was a clustering issue this weekend and that it has been resolved. I resubmitted my .bam files to Cufflinks over 24 hours ago and the jobs are still in the queue waiting to start. Is this an expected aftereffect of the clogged queue? How long should I expect to wait before the jobs begin to run? Is there a recommended course of action I can take to speed things up?
Thanks
modified 18 months ago by Jennifer Hillman Jackson ♦ 25k • written 18 months ago by luka.kremic • 20
Hi Luka,
There is a possibility that your job will not restart on its own. I would go ahead and rerun the job for this particular case and time. Anyone else with a larger job that has been queued for longer than a day, during this specific time window ONLY, should do the same.
We will be upgrading the cluster systems sometime in the next week or so, which will eliminate/reduce these sorts of issues. Look for the upcoming banner notice on the main server to know exactly when this will occur (http://usegalaxy.org). Thanks for sticking with us during the transitions to the larger clusters. These allow many more jobs to be processed yet still need a bit of tuning.
Take care, Jen, Galaxy team
Hi Jen,
I followed your advice and resubmitted 12 fastq files for Tophat yesterday (24 hours ago) after these failed to run over the weekend. All are still grey with an ! (none queued with the hourglass). I've already uploaded these fastq files twice last week (once just before the error that paused all jobs).
My question is, should I purge all the data and start fresh again or is the queue just very busy and my jobs will run eventually?
Thanks!
(Thanks to Luka for submitting this question)
Hi, From my experience launching jobs today, the cluster is busy. I would give it some more time. If another day passes, please let us know. We'll probably ask for a shared history link to determine the root cause. Very sorry for the continued problems, I know these can be difficult to sort out right now. So you know, reloading data should not be necessary under the current conditions. If that becomes a problem, we'll include that in replies (it has been very rare in the past). Jen
Thanks Jen! I'll keep you posted.
Hi Jen,
No movement at all again 24 hours later. I reuploaded the smallest fastq file as a test, and the fastq groomer job started (turned yellow) within 1 minute this morning. I'd rather not reupload the entire data set if I can avoid it. What do you suggest as the next step?
Thanks!
Hi - Please send an email to galaxy-bugs@lists.galaxyproject.org with a shared history link. The history should contain all inputs, stalled jobs, and outputs (failed or successful) as active datasets, not deleted. Please include a link to this post in the mail and send it from your registered account email at http://usegalaxy.org.
How to share a history: https://galaxyproject.org/learn/share/
Thanks Jen. I sent the email.
Thank you for sending this in. We are still working out the cluster issues. More feedback soon, hopefully with notice that the problems are resolved.
This is still the current status that includes a potential temporary workaround: https://biostar.usegalaxy.org/p/23068/#23188
Hi Jennifer, I set my Cufflinks jobs to run again and deleted all the old runs (on Monday after the update), and I still cannot get Cufflinks to begin running. There is an "!" beside all the job titles, and the TopHat dataset I'm running it from has an "An error occurred setting the metadata for this dataset" warning. Could either of these be a reason for my job not running? What are my options? Should I restart from scratch? Thanks
Yes, the metadata correction on the Tophat jobs should be done first, then the Cufflinks jobs started second. Rerunning Tophat will not necessarily result in a dataset without metadata problems, and it creates extra jobs that are not needed. The data is there, it just has some technical issues due to the way the tool interacts with Galaxy.
Correct the metadata by using the link in the warning or by clicking on the pencil icon and "redetecting" metadata on the Edit Attributes form (the first tab, a button near the bottom). Allow that to complete before starting downstream jobs.
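The ordering described above (fix Tophat metadata first, start Cufflinks only afterwards) can be sketched as a small check. This is only an illustration: the dataset records are made-up dictionaries, the function names are hypothetical, and the state strings ("ok", "failed_metadata", "queued") are assumed to match what a Galaxy server reports. It does not talk to Galaxy at all.

```python
from typing import Dict, List

# Hypothetical records; the "name" and "state" fields and the state strings
# are assumptions for illustration, not output copied from a Galaxy server.
Dataset = Dict[str, str]


def plan_next_steps(datasets: List[Dataset]) -> Dict[str, List[str]]:
    """Group input datasets by what has to happen before downstream jobs start."""
    plan: Dict[str, List[str]] = {
        "redetect_metadata": [],     # metadata warning: redetect first
        "ready_for_downstream": [],  # usable as input right now
        "wait": [],                  # still queued, running, or otherwise not done
    }
    for ds in datasets:
        if ds["state"] == "failed_metadata":
            plan["redetect_metadata"].append(ds["name"])
        elif ds["state"] == "ok":
            plan["ready_for_downstream"].append(ds["name"])
        else:
            plan["wait"].append(ds["name"])
    return plan


def can_start_downstream(plan: Dict[str, List[str]]) -> bool:
    """Downstream jobs (e.g. Cufflinks) should start only when every input is ready."""
    return not plan["redetect_metadata"] and not plan["wait"]


if __name__ == "__main__":
    tophat_outputs = [
        {"name": "Tophat accepted_hits 1", "state": "failed_metadata"},
        {"name": "Tophat accepted_hits 2", "state": "ok"},
    ]
    plan = plan_next_steps(tophat_outputs)
    print(plan["redetect_metadata"])
    print(can_start_downstream(plan))
```

In this sketch the first Tophat output would need its metadata redetected before any Cufflinks job is launched on it; the second could be used as is.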
Tophat is problematic and we plan to replace it completely with HISAT in the near term (at http://usegalaxy.org and within Galaxy training materials). You might consider making the switch now if using a workflow. The interruption of resetting Tophat metadata often means that workflows need to be split up - one for analysis through Tophat and one for downstream analysis - or the history can get very confusing fast with paused/restarted datasets.