Submit a Single Batch Job


  1. Create a Shell Script for Your Job
  2. Application Line: MATLAB
    Application Line: Stata
    Application Line: SAS
    Application Line: R
    Application Line: Splus
    Application Line: GAUSS
    Application Line: Ox
    Application Line: Singular
  3. Make the Script Executable
  4. Submit Your Job to the Queue
  5. Check the Status of Your Jobs
  6. Delete a Job
  7. Where is the Output?
  8. Submit Multiple Jobs

Hardin and seldon (the interactive computers to which you connect using SSH) are not the computational workhorses of the cluster. The main computing power of the cluster lies in the additional 32 processors which are available to programs submitted to the batch queue. These instructions will show you how your programs can use that power.

The job queue is like a valet:

You give it brief instructions (a shell script) telling it what program to run.

It waits until one of the batch cluster processors is free, then:
- runs the program on that processor until it is finished,
- writes out errors and output from your program,
- sends you email notifications at start and finish, if you wish.

Up to 8 of your jobs can be executing at a time, when resources are available. Each job will have exclusive use of one processor and up to 512MB of memory unless you specify otherwise.

Design your jobs to be rerun. Do not change the files your job will be using before the job finishes. Batch jobs may be rerun for a variety of reasons -- priority decisions, node failure, or administrative maintenance.
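
One common way to make a job safe to rerun is to write results to a temporary file and rename it only when the work has finished. The following is a minimal sketch of that idea, not a required pattern; the file names and the echo line that stands in for the real computation are hypothetical.

```shell
#!/bin/bash
# Work in a scratch directory for this demonstration.
cd "$(mktemp -d)" || exit 1

result=result.txt
tmp=$result.partial

# Remove leftovers from an earlier, interrupted run before starting.
rm -f "$tmp"

# Stand-in for the real computation; it writes to the temporary file.
echo "computed value: 42" > "$tmp"

# Rename only after the work has finished, so a rerun never finds a
# half-written result under the final name.
mv "$tmp" "$result"

cat "$result"
```

If the job is terminated and restarted, the next run simply discards the partial file and starts over.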

Jobs in an execution queue will be preempted by the scheduler if more than 4 of your jobs are running at one time and another job becomes more deserving of execution, as determined by a fairshare algorithm. A preempted job is either suspended (temporarily stopped) or terminated (ended and put back into the input queue). Of the jobs eligible to be preempted, the one that has used the least CPU time will be selected.

Preempted jobs gain priority over time, and they will be put into execution again, possibly preempting others' jobs.

You can improve the throughput of your jobs, and possibly avoid preemption, if you limit the maximum amount of CPU time your job will use by specifying that limit when you submit the job. The scheduler will factor this limit into the fairshare decision-making process. If you do not specify a CPU time limit, the scheduler will assume your job will run for one month.

 

Create a Shell Script for Your Job

A shell script is a set of instructions that the cluster node needs to find and run your program. It's a simple text file (usually with a .txt extension). You can create it with the nano editor on an interactive system such as hardin, seldon or mule2.

To create a shell script called myprog.txt, at the prompt type in:

[abc123@seldon abc123]$ nano myprog.txt
When you are finished typing, press <Ctrl>-X to exit the nano editor and Y to save the changes.

To run one program you need a five-line shell script like this:

#!/bin/bash
#PBS -j oe

cd ~/myprograms
matlab -r 'myprog'

The first line, #!/bin/bash, specifies which shell program runs the script. It is mandatory and does not change.

The second line #PBS -j oe tells PBS to join standard output and standard error together in the output file that is delivered back to the submitting directory. Note that there is a space between the join option (-j) and its values (oe).

An optional line following the second line tells PBS the maximum amount of CPU time the job will take. CPU time is specified in hours:minutes:seconds, so one hour would be written as "1:00:00". The following PBS directive limits (using the -l option) a job to three hours of CPU time. Place it immediately after line 2 of your shell script if you want to use it:

#PBS -l cput=3:00:00
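
If you estimate run time in seconds, a small helper can convert the figure into the hours:minutes:seconds form. This is only a sketch; the function name to_cput is hypothetical and is not part of PBS.

```shell
#!/bin/bash
# to_cput: hypothetical helper (not a PBS command) that converts a number
# of seconds into the hours:minutes:seconds form used by "#PBS -l cput=".
to_cput() {
    local total=$1
    local hours=$(( total / 3600 ))
    local minutes=$(( (total % 3600) / 60 ))
    local seconds=$(( total % 60 ))
    printf '%d:%02d:%02d\n' "$hours" "$minutes" "$seconds"
}

# 10800 seconds is three hours of CPU time, as in the directive above.
echo "#PBS -l cput=$(to_cput 10800)"
```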

The third line of the example script is blank. It makes the script more readable.

The fourth line cd ~/myprograms tells the cluster node to change working directory to ~/myprograms (all the nodes use the same storage system for your home directory). The tilde sign ~ is a shortcut to your home directory.

The last line tells MATLAB to start in batch mode and run myprog.m located in your myprograms directory. Commands for running in batch mode differ among program applications.
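
As an alternative to typing the script in nano, you can write it from the command line with a here-document. The sketch below creates the example script with the optional CPU time limit in place, so you can see where each line goes. It writes into a scratch directory created by mktemp, which is an assumption made so the example does not touch your own files.

```shell
#!/bin/bash
# Write the example job script into a scratch directory.
workdir=$(mktemp -d)
script="$workdir/myprog.txt"

# The quoted 'EOF' stops the shell from expanding ~ or the quotes
# inside the text, so the lines are stored exactly as written.
cat > "$script" <<'EOF'
#!/bin/bash
#PBS -j oe
#PBS -l cput=3:00:00

cd ~/myprograms
matlab -r 'myprog'
EOF

# Display the finished script to confirm the order of the lines.
cat "$script"
```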

 

Application Line: MATLAB

matlab -nodisplay -r 'commands'
Runs MATLAB commands or your own M-files from the working directory. Separate multiple commands with commas or semicolons (;). Do not include the pathname or a file extension (.m) to run an M-file. Put quotes around your list of commands.

You can pass parameters to your M-file using this syntax, for example:

matlab -nodisplay -r 'myprog(3.8, 0.2, 2.5)'

If you are submitting multiple jobs which execute the same MATLAB program with different parameters, you need a way to distinguish the output files. You can do this simply by printing the parameter values in the beginning of your MATLAB code. You can also redirect MATLAB output into a log file with a name that contains the parameters. This command

matlab -r 'myprog(3.8, 0.2, 2.5)' > myprog_3.8_0.2_2.5.log
will save MATLAB output in a file called myprog_3.8_0.2_2.5.log.
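
If you change the parameters often, you can let the shell build both the MATLAB call and the matching log file name from the same variables. The following is a sketch under that assumption; the variable names a, b and c are hypothetical, and the command is printed rather than executed so the example also runs on a machine without MATLAB.

```shell
#!/bin/bash
# Hypothetical parameter values for myprog; change them for each job.
a=3.8
b=0.2
c=2.5

# Build the MATLAB call and a log file name that records the parameters.
call="myprog($a, $b, $c)"
logfile="myprog_${a}_${b}_${c}.log"

# Print the application line instead of running it. In a real job
# script, the printed line would be the last line of the script.
echo "matlab -nodisplay -r '$call' > $logfile"
```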

An alternative MATLAB command is:

matlab -nodisplay < myprog.m

The less-than sign (<) feeds myprog.m into MATLAB line by line, as if you were typing it in. You cannot pass parameters to your program using this syntax.

You can also redirect the output to a special log file by adding ">filename.log" to the command:

matlab -nodisplay < myprog.m > filename.log

 

Application Line: Stata

stata -b do myprog.do
Stata will write its output to myprog.log in the working directory.

 

Application Line: SAS

sas myprog.sas
SAS will write a log file named myprog.log and an output file with results named myprog.lst.

 

Application Line: R

R CMD BATCH myprog.R myprog.log
or
R --no-save < myprog.R > myprog.log
R will write a log file myprog.log.

 

Application Line: Splus

Splus BATCH input_file output_file

Splus reads the program from input_file and writes the results to output_file.

 

Application Line: GAUSS

gauss -b finance.e > finance.lst
GAUSS writes its results to standard output, which in this case is redirected to the file named finance.lst.

 

Application Line: Ox

oxl finance.ox > finance.lst
or
oxl finance.oxo > finance.lst
Ox writes its results to standard output, which in this case is redirected to the file named finance.lst.

 

Application Line: Singular

Singular -t -q < adjoint.sing > adjoint.lst
Singular writes its results to standard output, which in this case is redirected to the file named adjoint.lst.


For other programs, see Statistical Software Manuals

 

Make the Script Executable

You should make your shell script executable and test it before you submit it to the queue. At the prompt, type in:

chmod u+x myprog.txt

You can change permissions on multiple shell scripts located in the same directory at once:

chmod u+x *.txt

Test your script by running it (you can abort it by typing <Ctrl>-C):

./myprog.txt

Remember to clean up any unwanted files your script may have created when you tested it.
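
Before a long test run, you may want two quick checks: that the execute permission is really set, and that the script has no syntax errors. The sketch below does both on a throwaway script created with mktemp; with a real job you would point the script variable at your own file, for example myprog.txt. The variable names are hypothetical.

```shell
#!/bin/bash
# Create a throwaway script so the checks have something to work on.
script=$(mktemp)
printf '#!/bin/bash\n#PBS -j oe\n\necho "hello from the test job"\n' > "$script"

chmod u+x "$script"

# -x tests the execute permission that chmod u+x just set.
is_exec=no
[ -x "$script" ] && is_exec=yes

# bash -n parses the script without executing any command in it,
# which catches syntax errors such as an unclosed quote.
syntax=bad
bash -n "$script" && syntax=ok

echo "executable: $is_exec, syntax: $syntax"

rm -f "$script"
```

A clean syntax check does not prove the job will work, so it complements the test run rather than replacing it.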

 

Submit Your Job to the Queue

The qsub command sends your jobs for execution on cluster nodes:

qsub -m abe -N jobname myprog.txt

-N jobname sets the name under which your job appears in the queue. The name can be up to 15 characters long, must start with a letter, and cannot contain spaces. If you omit the -N option, your job will have the same name as your shell script.

The letters following -m specify what email updates you will receive about your job:

n: no mail.
a: mail is sent when the job is aborted by the batch system.
b: mail is sent when the job begins execution.
e: mail is sent when the job ends.

If the -m option is not specified, mail will be sent only if the job is aborted (same as -m a).

The last parameter (do not omit it) is the name of your shell script file.
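
If you generate job names automatically, you can check them against the naming rules before calling qsub. The following is a minimal sketch; valid_jobname is a hypothetical helper, not a PBS command, and the qsub line is printed rather than executed so the example runs on a machine without PBS.

```shell
#!/bin/bash
# valid_jobname: hypothetical helper that applies the naming rules above:
# at most 15 characters, no spaces, first character a letter.
valid_jobname() {
    local name=$1
    [ "${#name}" -le 15 ] || return 1
    case $name in
        *[[:space:]]*) return 1 ;;   # contains whitespace
        [A-Za-z]*)     return 0 ;;   # starts with a letter
        *)             return 1 ;;   # empty, or starts with another character
    esac
}

jobname=simul007
if valid_jobname "$jobname"; then
    # Printed rather than executed; remove the echo on the cluster.
    echo "qsub -m abe -N $jobname myprog.txt"
else
    echo "invalid job name: $jobname" >&2
fi
```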

 

Check the Status of Your Jobs

To check the status of your jobs, type in qstat at the prompt. It will display the status of the entire job queue:

[abc123@seldon abc123]$ qstat
Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
5002.seldon      Job52            abc123            55:15:0 R A
5031.seldon      simul007         def456            18:45:4 R A
   ...
5068.seldon      m331             xyz987                  0 Q A

The Name column lists job names assigned in the qsub command. Give different names to your jobs to distinguish them from each other in the job queue.

The S column indicates the job state:

E - Job is exiting after having run.
H - Job is held.
Q - Job is queued, eligible to run or routed.
R - Job is running.
T - Job is being moved to new location.
W - Job is waiting for its execution time (-a option) to be reached.
S - Job is suspended.
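
When the queue is long, a short awk filter can summarize the state column for you. The sketch below works on sample text in the layout shown above; on the cluster you would pipe the real qstat output into the same awk command. It assumes that job names contain no spaces, so that the state is always the fifth column.

```shell
#!/bin/bash
# Sample text in the qstat layout shown above.
sample='Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
5002.seldon      Job52            abc123            55:15:0 R A
5031.seldon      simul007         def456            18:45:4 R A
5068.seldon      m331             xyz987                  0 Q A'

# Skip the two header lines, then tally the state letter in column 5.
counts=$(printf '%s\n' "$sample" |
    awk 'NR > 2 && NF == 6 { n[$5]++ }
         END { printf "running=%d queued=%d\n", n["R"], n["Q"] }')

echo "$counts"
```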

pbstat condenses and interprets the output of qstat -f to display more readable information about PBS jobs. With no arguments, pbstat will display information only about your own jobs. You can specify another username or the keyword all as a command argument to display PBS jobs owned by other users.

[abc123@seldon abc123]$ pbstat
------------------------------------------------------
PBS Job ID number      :  7364.seldon
Job owner              :  abc123@seldon.it.northwestern.edu
Job name               :  stage1.run
Job started on         :  Tue Oct 23 08:16:08 2006
Job status             :  Running
Mail Points            :  a
PBS queue and server   :  A on seldon
Job is running on      :  node21:mem=524228kb;ncpus=1
# of CPUs being used   :  1
CPU utilization        :  98% (ideal max is 100%)
Elapsed walltime       :  08:13:35 (max is 672:00:00)
Elapsed CPU time       :  08:13:00 (max is 672:00:00)
Memory usage           :  105.5 MB
VMemory usage          :  818.1 MB

 

Delete a Job

If you need to remove your job from the queue before it starts, or if you want to terminate an already running batch job, type in:

qdel job_id
job_id is the number listed in the first column of qstat output.
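
If you have several jobs to remove, you can pull your own job ids out of the qstat listing instead of copying them by hand. The sketch below is one way to do that, using sample lines in the qstat layout shown earlier and the placeholder user name abc123. It prints the qdel commands rather than running them, so nothing is deleted until you have reviewed the list.

```shell
#!/bin/bash
# Sample qstat lines; on the cluster, use the real qstat output instead.
sample='5002.seldon      Job52            abc123            55:15:0 R A
5031.seldon      simul007         def456            18:45:4 R A
5068.seldon      m331             xyz987                  0 Q A'

me=abc123   # placeholder user name from this document

# Column 3 is the job owner and column 1 is the job id.
ids=$(printf '%s\n' "$sample" | awk -v user="$me" '$3 == user { print $1 }')

# Print one qdel command per matching job for review.
for id in $ids; do
    echo "qdel $id"
done
```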

 

Where is the Output?

When a job is finished you will see a new file in the directory from which you typed in the qsub command. It has a .oXXXX extension and contains standard (text) output of the program.

You may also see a similar file with the .eXXXX extension, which contains the standard error output of your job. XXXX in both cases is the job_id.

You should use the #PBS -j oe command to join standard output and standard error for your job. That's explained in Create a Shell Script for Your Job, above.
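
To find the output file from a script, you can derive its name from the job name and job id. The sketch below assumes the usual PBS default, in which the job name comes before the .oXXXX extension; the job name stage1 and the id 7364.seldon are hypothetical values.

```shell
#!/bin/bash
# Hypothetical values: a job named stage1 whose id, as listed by qstat,
# is 7364.seldon.
jobname=stage1
jobid=7364.seldon

# Strip the server suffix to leave only the number.
number=${jobid%%.*}

# Assumption: PBS puts the job name in front of the extension.
outfile="$jobname.o$number"
errfile="$jobname.e$number"

echo "output: $outfile"
echo "errors: $errfile (not created when #PBS -j oe is used)"
```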

Only text output is automatically saved in the log file. If your program produces graphs you need to add instructions to your program to save those graphs to disk in a file. In MATLAB this is done by the saveas function.
