This scheduler is particularly well suited to parallel computer systems, where jobs get exclusive access to the nodes assigned to them by the batch system. Here we explain the basics of how it works.
In this picture, the horizontal axis represents the processors in the system and the vertical axis represents time. At this moment, the machine is empty. A job is represented as a rectangle: its width is the number of processors the job needs, its height the time the job needs.
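To make the picture concrete, here is a minimal sketch in Python of the rectangle a job occupies in this diagram. The names Job and Reservation are ours, chosen for the explanation; they are not Moab data structures.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Job:
        name: str
        nodes: int     # width of the rectangle: processors required
        walltime: int  # height of the rectangle: requested run time

    @dataclass(frozen=True)
    class Reservation:
        """A job placed in the diagram: a rectangle anchored at a start time."""
        job: Job
        start: int

        @property
        def end(self) -> int:
            return self.start + self.job.walltime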
In this picture we see that two jobs are running (job A and job B) and that job C has to be scheduled. There is clearly enough room to accommodate job C, so it will run immediately.
Now, with the arrival of job D, there is a problem: there are not enough free nodes to accommodate that job. The scheduler therefore places job D on top of job C. Note that in the meantime the running jobs have less time left.
Jobs E and F have arrived. Job E can be scheduled before job D, because job D can only run after job C finishes. Job F can be scheduled alongside job C without hindering any other job, so it starts immediately.
Now job G, a rather large job, arrives. There is no space for it below job D, so it will be put on top of job D.
Job B has ended. Job G could start now, and indeed other schedulers would start it. In the Moab case, however, job G has to wait because other jobs were submitted earlier. The owner of job G could be surprised: there are enough free nodes for his job, yet it is not started! Job H, however, will be started immediately: it fits nicely in the hole under job D. Note that starting job H does not delay the expected start times of the other jobs.
Of course, some simplifications were made: the pictures suggest that a job always gets a contiguous row of nodes, which in general is not the case. Also, jobs tend to finish in less time than the submitter asked for. But these things do not matter much: the scheduler determines what has to be done after each event (the arrival of a job, the termination of a job, and so on). Running jobs are fixed and keep running, but the scheduler takes the incoming jobs in order of appearance and plays a kind of 'tetris' with them, as shown in the example. As a result, a submitted job is guaranteed to run at a predictable time in the future; the only change can be that other jobs end earlier than predicted, in which case the job will start earlier.
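Continuing the Job and Reservation sketch above, the following Python shows one way to play this 'tetris': each queued job is reserved, in order of appearance, at the earliest time its rectangle fits. This is a simplified reconstruction of the idea, not Moab's actual implementation.

    def free_nodes_at(reservations, total_nodes, t):
        """Nodes not claimed by any reservation active at time t."""
        used = sum(r.job.nodes for r in reservations if r.start <= t < r.end)
        return total_nodes - used

    def earliest_start(reservations, total_nodes, job, now):
        """Earliest time at which the job's rectangle fits for its whole
        requested walltime. Free nodes only increase when a reservation
        ends, so 'now' and the reservation end times are the only
        candidate start times."""
        candidates = sorted({now} | {r.end for r in reservations if r.end > now})
        for t in candidates:
            # Availability can only drop when a reservation starts, so it
            # suffices to check t and every start inside [t, t + walltime).
            checks = [t] + [r.start for r in reservations
                            if t < r.start < t + job.walltime]
            if all(free_nodes_at(reservations, total_nodes, c) >= job.nodes
                   for c in checks):
                return t
        raise ValueError("job asks for more nodes than the machine has")

    def plan(running, queue, total_nodes, now):
        """Reserve each queued job, in order of appearance, at its earliest
        slot. Running jobs are fixed; a later job may fill a hole, but it
        cannot delay an earlier job, whose reservation already exists when
        the later job is placed."""
        reservations = list(running)
        for job in queue:
            start = earliest_start(reservations, total_nodes, job, now)
            reservations.append(Reservation(job, start))
        return reservations

Because a reservation, once made, is never moved later, every job gets a guaranteed latest start time; recomputing the plan after an event can only move jobs forward.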
This 'tetris' mechanism is very flexible. It is possible to give a job a higher priority by simply pretending that it was submitted earlier, or a lower priority by pretending it was submitted later. When priorities are altered this way, the start times of jobs become less predictable, but there is still a guarantee that the job will eventually run. This mechanism is used to implement a 'fair share' algorithm: when a user or group of users is using more of the system than their share, the priority of their jobs is lowered.
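A minimal sketch of that idea, again in Python: sort the queue by a virtual submit time, where over-use pushes a job back as if it had been submitted later. The field names and the penalty formula below are illustrative assumptions, not Moab's actual accounting.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class QueueEntry:
        job: Job          # the rectangle from the sketch above
        user: str
        submitted: float  # real submission time, in seconds

    def fair_share_order(entries, usage, share, penalty=3600.0):
        """Sort the queue by a virtual submit time: each unit by which a
        user exceeds their share delays the virtual submission by
        'penalty' seconds (an invented formula). The delay is always
        finite, so every job still runs eventually."""
        def virtual_submit(e):
            overuse = max(0.0, usage.get(e.user, 0.0) - share.get(e.user, 0.0))
            return e.submitted + penalty * overuse
        return sorted(entries, key=virtual_submit)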
In practice, some extra measures are taken to prevent one user from filling the whole machine with jobs for a week or more. The above-mentioned priority mechanism can be used for that purpose.
For users of a Moab scheduler it is important to realize that the more accurate the estimate of a job's running time, the better the chance that the job can be scheduled into a hole. For example, job G could have been scheduled right after job B if its estimated running time had been shorter.
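Using the earliest_start function sketched earlier, a toy machine of 10 nodes makes the point; the numbers are invented for illustration.

    total_nodes = 10
    now = 0
    fixed = [Reservation(Job("C", 6, 5), 0),    # running now, ends at t=5
             Reservation(Job("D", 8, 10), 5)]   # reserved right behind C

    # Asking for 10 time units: G cannot finish before D needs its nodes,
    # so it is pushed past D entirely.
    print(earliest_start(fixed, total_nodes, Job("G", 4, 10), now))  # -> 15

    # Asking for 5 time units: the same G fits in the hole next to C
    # and starts immediately.
    print(earliest_start(fixed, total_nodes, Job("G", 4, 5), now))   # -> 0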