PBS Directives
From UF HPC Wiki
Some of this information is at the Sample Scripts page.
These are job settings that can be specified in the job submission script passed to the qsub command. PBS directives are given in comment lines that begin with the #PBS prefix.
You can add an ordinary comment after a directive by starting it with a double #, e.g.:
#PBS -M YOUR_MAIL_ADDRESS_HERE ##a regular comment: send the abort/error messages to only my computer
When multiple directives are given with the same option, only the last one is used. The exceptions are the -l option, where each resource can be on a separate line (otherwise the resources are comma delimited), and the -W option, where additional attributes can be on separate lines.
- Some of these options depend on Maui, and Maui may not implement all of them (Maui does not support job-defined priority attributes (-p), but job priorities can be set in other ways...)
- Items in bold are "interesting", items in italics are possible literal values, bracketed items are optional
- Not all directives are listed; some are obsolete, useless, or inappropriate for the most part
- Most of this information is compiled from the torque/maui wikis and the man pages.
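The rules above can be sketched in a short job script. This is only an illustration: the job name comment_demo and the echoed message are hypothetical, and the resource values are borrowed from the examples on this page.

```shell
#!/bin/bash
#PBS -N comment_demo            ## hypothetical job name
#PBS -m abe                     ## overridden: a later -m line follows
#PBS -m ae                      ## last -m wins: mail on abort and end only
#PBS -l walltime=00:10:00       ## -l resources may be split across lines ...
#PBS -l nodes=1:ppn=1           ## ... and are combined rather than overridden
#PBS -l pmem=400mb,pvmem=2gb    ## or comma delimited on a single line

# To the shell, every #PBS line above is just a comment.
message="directives above are plain comments to the shell"
echo "$message"
```

Because the directives are comments, the same file can also be run directly with bash to check the body before submitting it with qsub.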
Resource Specification
Specify resources required by the job using the -l directive.
The most common kind of resources you can specify are the time, processors, and memory your job needs.
Queues other than the default (e.g. the testq or altixq queues) may impose different limits on the resources you can request.
Examples
Walltime
Add some cushion to your estimate of the time needed, but keep in mind that a shorter job will generally be scheduled sooner.
Run the job for 7 hours 30 minutes and 20 seconds:
#PBS -l walltime=07:30:20
Run the job for 100 hours:
#PBS -l walltime=100:00:00
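As a rough aid for adding a cushion, the following sketch pads an estimate given in seconds and prints it as HH:MM:SS. The function name walltime_with_cushion and the 20 percent default are assumptions made for illustration, not site recommendations.

```shell
#!/bin/bash
# walltime_with_cushion SECONDS [PERCENT]
# Prints SECONDS padded by PERCENT (default 20, an arbitrary choice)
# as an HH:MM:SS string usable in a walltime request.
walltime_with_cushion() {
    local estimate=$1 percent=${2:-20}
    local total=$(( estimate + estimate * percent / 100 ))
    printf '%02d:%02d:%02d\n' \
        $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

# An estimate of 7 hours 30 minutes 20 seconds is 27020 seconds.
echo "#PBS -l walltime=$(walltime_with_cushion 27020)"
```

The printed line can be pasted into a job script; passing 0 as the second argument prints the estimate without padding.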
Processors
Number of processors/nodes to use, and interconnect (gigabit ethernet or infiniband)
Run only across 1 node using only 1 processor per node, using infiniband interconnect.
#PBS -l nodes=1:ppn=1
Run only across 2 nodes using only 4 processors per node (so 8 processors total), using gigabit ethernet interconnect
#PBS -l nodes=2:ppn=4:gige
Run only across 2 nodes using only 4 processors per node (so 8 processors total), using infiniband interconnect
#PBS -l nodes=2:ppn=4:infiniband
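The "processors total" arithmetic in these examples is nodes multiplied by ppn. A minimal sketch of that calculation follows; the helper name total_procs is hypothetical, and treating a missing ppn as 1 is an assumption about the default.

```shell
#!/bin/bash
# total_procs SPEC
# SPEC is a nodes request such as "nodes=2:ppn=4:gige".
# Prints nodes * ppn; a missing ppn is treated as 1 (assumed default).
total_procs() {
    local spec=$1 nodes ppn
    nodes=${spec#nodes=}      # drop the leading "nodes="
    nodes=${nodes%%:*}        # keep the text before the first ":"
    case $spec in
        *ppn=*) ppn=${spec#*ppn=}; ppn=${ppn%%:*} ;;
        *)      ppn=1 ;;
    esac
    echo $(( nodes * ppn ))
}

total_procs "nodes=2:ppn=4:gige"
```

Trailing properties such as gige or infiniband are ignored by the helper, since they select the interconnect rather than the processor count.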
Memory
Memory requirements
As with walltime, a job that exceeds its memory request will be killed.
The maximum amount of physical memory allowed for any single process, e.g.: use a maximum of 400mb of physical memory for each process (the job is killed if a process goes over; the default is 600mb)
#PBS -l pmem=400mb
The maximum amount of virtual memory allowed for any single process, e.g.: use a maximum of 2gb of virtual memory for each process:
#PBS -l pvmem=2gb
Disk Usage
Limit the local disk usage allocated for the jobs
To limit the amount of disk space used by a job, use the following:
#PBS -l file=10gb
CPU Time
Limit the CPU time for the jobs. You will probably only need the walltime limit above, but just in case... (values are given either in seconds or as HH:MM:SS)
Total CPU time allowed for all processes, e.g.: together all the processes will use no more than a total of 20 hours of CPU time:
#PBS -l cput=20:00:00
Maximum CPU time allowed for any single process, e.g.: no single process will use more than 5 hours of CPU time:
#PBS -l pcput=05:00:00
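Since these limits accept either seconds or HH:MM:SS, a small converter can help when comparing the two forms. The function name hms_to_seconds is hypothetical; the sketch accepts the [[HH:]MM:]SS shape.

```shell
#!/bin/bash
# hms_to_seconds TIME
# TIME is [[HH:]MM:]SS, e.g. "20:00:00", "30:20" or "90".
# Prints the equivalent number of seconds.
hms_to_seconds() {
    local IFS=: total=0 part
    for part in $1; do                       # unquoted: split on ":"
        total=$(( total * 60 + 10#$part ))   # 10# avoids octal for "08"
    done
    echo "$total"
}

echo "#PBS -l cput=$(hms_to_seconds 20:00:00)"
echo "#PBS -l pcput=$(hms_to_seconds 05:00:00)"
```

In this example the total limit (72000 seconds) is four times the per-process limit (18000 seconds), so four processes could each use their full per-process allowance.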
Other Resources
More resource requirements
See this page on torque's wiki for more resources you can specify.
Some resources defined by torque, and extended resources defined by maui, may simply be ignored depending on the operating system or other settings. For example, the "mem" resource (total maximum physical memory used by the job) is ignored on Linux for a job using more than 1 node.
Full List of Options
| option | value | description | examples | |
|---|---|---|---|---|
| a | YYYYMMDDhhmm | Make the job eligible to run only after this date and time (the "Release date" or "Job Execution time"). This example releases the job on December 25, 2008, 4:35 pm | #PBS -a 200812251635 | |
| c | s, c, or c=minutes | Checkpoint interval; the example checkpoints every 90 minutes. s: checkpoint when the server running the job is issued a shutdown command. c: checkpoint at the default server minimum time. c=minutes: checkpoint every so many minutes of CPU time. | #PBS -c c=90 | |
| e | absolute file path | Path to store standard error for the job | #PBS -e /scratch/ufhpc/YOURUSERNAME/test.err | |
| j | oe or eo | Merge the standard error and standard output, intermixed, into one file: oe to put it in the standard output path, or eo to put it in the standard error path | #PBS -j oe | |
| l | walltime=[[HH:]MM:]SS | Resources the job requires. Many types and combinations of resources can be specified; see the examples under Resource Specification above. Ex: runs for 5 hours 30 minutes maximum (the job is killed if not done in 5 hours 30 minutes). | #PBS -l walltime=05:30:00 | |
| m | aben | Events for which to send mail to the mail users (see -M), e.g. send emails on any change in the job (started, finished, aborted). Any combination of a (abort), b (started), or e (finished), or only n (send no email). The second example sends emails on abort and finished. | #PBS -m abe or #PBS -m ae | |
| o | absolute file path | Path to store standard output for the job | #PBS -o /scratch/ufhpc/YOURUSERNAME/test.out | |
| p | integer | Job Priority (not available since we are using Maui) | n/a | |
| q | queuename | Run the job on a specific queue (which has different resources already defined). The first example runs on the test queue discussed in Job_Submission_Queues, which has a 10 minute limit; the second runs the job on altix (if you are authorized). | #PBS -q testq or #PBS -q altixq | |
| r | y/n | Rerun the job (possibly): You really want to set this to n as it might do something you don't want | #PBS -r n | |
| t | n/a | Job arrays (still under development with torque 2.2) | | |
| v | VAR=VALUE,VAR=VALUE,... | Environment variables in the list are exported to the job (see also -V) | #PBS -v DATAFILE=/scratch/crn/USER/etc | |
| M | email,email,email... | Send emails on the events (see -m option) specified to all these email addresses | #PBS -M YOUR_MAIL_ADDRESS_HERE,EXTRA_OPTIONAL_EMAIL_ADDRESSES_TO_SEND_TO | |
| N | string | Job Name | #PBS -N fft_test_23 | |
| S | full path | Use an interpreter other than the default, provided it is installed and its comment lines start with # (e.g. bash, tcsh, tclsh, perl, etc.). (I believe this is redundant with Torque when the interpreter is named at the top of the file, but it was used in OpenPBS.) | #PBS -S /bin/bash | |
| V | | All environment variables in the qsub command's environment are to be exported to the batch job. | #PBS -V | |
| W | stagein, stageout | Specify additional attributes, mostly advanced options (further attributes, not covered in the man page, are discussed elsewhere and should be usable for job ordering or job dependencies). stagein or stageout: copy files to the execution host before a job starts (stagein), or copy files from the execution host after the job ends (stageout): see this for more info. You can use the before*/after*/sync*/on attributes to order jobs or manage job dependencies on a job script by job script basis (see the discussion link above covering all the attributes for -W). | (not really used) | |
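
Several of the options in the table can be combined in one submission script. The following is a sketch only, reusing the placeholder values from the table; the fallbacks on PBS_O_WORKDIR and PBS_JOBNAME (variables that Torque sets inside a running job) are assumptions added so the script also runs outside the batch system.

```shell
#!/bin/bash
#PBS -N fft_test_23
#PBS -o /scratch/ufhpc/YOURUSERNAME/test.out
#PBS -j oe                      ## merge stderr into the stdout file
#PBS -m ae                      ## mail on abort and end
#PBS -M YOUR_MAIL_ADDRESS_HERE
#PBS -q testq                   ## test queue, 10 minute limit
#PBS -r n                       ## do not rerun the job
#PBS -l walltime=00:05:00
#PBS -l nodes=1:ppn=1
#PBS -l pmem=400mb

# Change to the directory qsub was run from, or stay put if unset.
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1

# Use the job name provided by the batch system, if any.
job_name="${PBS_JOBNAME:-fft_test_23}"
echo "Starting $job_name on $(uname -n)"
```

Replace the placeholder path and mail address before submitting the file with qsub.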
