Paraloop
Paraloop distributes your jobs across several processors, independently of the machine architecture: a single SMP computer with shared memory, a cluster, or even a network of workstations.
Paraloop is best suited to cases where a large number of independent tasks must be executed, as is often the case in the data processing pipelines found in bioinformatics projects.
Paraloop is a tool for programmers, who can easily distribute their jobs while using the same script whatever machine they run on. It is an object-oriented Perl program: data processing is wrapped inside one object (called a "plugin"), and the code responsible for interacting with the machine is wrapped inside another object (called a "scheduler"). It is thus relatively easy to adapt Paraloop to a new architecture: it only means writing a new scheduler (in fact, just a few methods). The same is true for plugins: each one reads and processes data in a particular format, so supporting a new format means writing a new plugin.
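The plugin/scheduler split can be pictured with a minimal sketch. The package names (Sketch::LinePlugin, Sketch::SerialScheduler) and method names (task_count, run_task, dispatch) are hypothetical and chosen for illustration only; they are not Paraloop's actual interface. The scheduler here runs everything serially and merely records which worker each task would go to.

```perl
use strict;
use warnings;

# Hypothetical plugin: knows how to read and process the data,
# and nothing about the machine it runs on.
package Sketch::LinePlugin;

sub new {
    my ($class, %args) = @_;
    return bless { lines => $args{lines} }, $class;
}

# Number of independent tasks the data can be split into.
sub task_count {
    my ($self) = @_;
    return scalar @{ $self->{lines} };
}

# Process a single task, identified by its index.
sub run_task {
    my ($self, $i) = @_;
    return uc $self->{lines}[$i];
}

# Hypothetical scheduler: knows how to spread task indices over workers,
# and nothing about the data format.
package Sketch::SerialScheduler;

sub new {
    my ($class, %args) = @_;
    return bless { workers => $args{workers} || 1 }, $class;
}

# Assign task i to worker (i mod N), round-robin, and run it in-process.
sub dispatch {
    my ($self, $plugin) = @_;
    my @results;
    for my $i (0 .. $plugin->task_count() - 1) {
        my $worker = $i % $self->{workers};
        push @results, { worker => $worker, output => $plugin->run_task($i) };
    }
    return \@results;
}

package main;

my $plugin    = Sketch::LinePlugin->new(lines => [qw(acgt ttga ggcc)]);
my $scheduler = Sketch::SerialScheduler->new(workers => 2);
my $results   = $scheduler->dispatch($plugin);
printf "worker %d: %s\n", $_->{worker}, $_->{output} for @$results;
```

In this sketch, porting to a new architecture would mean replacing only the scheduler class, while a new data format would mean replacing only the plugin class.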
A few plugins are delivered with Paraloop: some are specific to bioinformatics (one of them runs BLAST in parallel, for instance), while others are completely generic (reading a text file, ...). Plugins dedicated to other fields would be a useful addition.
When Paraloop is used with a batch queue that limits the CPU time per job, it can be configured so that the current job interrupts itself before being killed by the system; the job is resubmitted to the queue just before the interruption, so that it resumes as soon as the system permits.
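The interrupt-and-resubmit idea can be sketched as a simulation. Everything here is an assumption made for illustration: the run_slice function, the limit and margin parameters, and the fixed per-task costs are hypothetical, and no real queue is contacted. A real tool would compare elapsed CPU time against the queue limit and call the queue's own submission command instead of looping.

```perl
use strict;
use warnings;

# Run tasks from index $a{start} until the next one would push the
# simulated CPU time past (limit - margin). Returns the index to resume
# from and the time consumed in this slice.
sub run_slice {
    my (%a) = @_;
    my ($costs, $i, $used) = ($a{costs}, $a{start}, 0);
    while ($i < @$costs) {
        last if $used + $costs->[$i] > $a{limit} - $a{margin};
        $used += $costs->[$i];    # "execute" the task
        $i++;
    }
    return ($i, $used);
}

my @costs = (3, 3, 3, 3, 3);      # simulated cost of each task (assumed known)
my ($next, $submissions) = (0, 0);

# Each loop iteration stands for one submission of the job to the queue.
while ($next < @costs) {
    my ($resume_at, $used) = run_slice(
        costs => \@costs, start => $next, limit => 10, margin => 2,
    );
    die "a single task exceeds the usable budget\n" if $resume_at == $next;
    $submissions++;
    print "submission $submissions: tasks $next..", $resume_at - 1,
          " ($used time units)\n";
    $next = $resume_at;           # checkpoint: where the resubmitted job resumes
}
```

The safety margin is what lets the job stop on its own terms: it gives up while there is still time left to record its position and resubmit itself, rather than being killed mid-task.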
Paraloop also includes a command that prints a progress report for each job.
Finally, a "load balancing mode" is available: it may be used to ensure that all the jobs take approximately the same time to execute.
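How Paraloop balances the load is not described here. As one possible illustration of the goal, the sketch below uses a generic greedy heuristic (heaviest task first, always given to the job with the smallest running total). The balance function and the task costs are hypothetical, and the heuristic assumes task costs can be estimated in advance.

```perl
use strict;
use warnings;

# Greedy balancing: take tasks from heaviest to lightest and give each one
# to the job whose accumulated cost is currently the smallest.
sub balance {
    my ($n_jobs, @costs) = @_;
    my @jobs = map { +{ total => 0, tasks => [] } } 1 .. $n_jobs;
    for my $cost (sort { $b <=> $a } @costs) {
        my ($lightest) = sort { $a->{total} <=> $b->{total} } @jobs;
        $lightest->{total} += $cost;
        push @{ $lightest->{tasks} }, $cost;
    }
    return @jobs;
}

my @jobs = balance(2, 7, 5, 4, 3, 1);
print "job total: $_->{total} (tasks: @{ $_->{tasks} })\n" for @jobs;
```

With these five example costs and two jobs, both jobs end up with a total of 10, so they would finish at about the same time.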