A set of utilities for consistent data management of batch job submission for multiple venues
The batchsubmit command provides access to a comprehensive, secure infrastructure that supports the submission, execution, and return of batch jobs. Batchsubmit was created specifically to run OpenSees batch jobs, although other types of jobs can be run if the executable and all supporting files are provided.
Batch jobs can be run either locally on the NEEShub infrastructure or remotely on other platforms (venues) in serial or parallel modes. When run remotely on other venues, the batchsubmit command transparently handles authentication and communication between NEEShub and high performance computing (HPC) venues, as well as efficient transfer of data both ways.
Access to batchsubmit
Because the batchsubmit command is invoked via the Workspace tool, NEEShub users must have the ability to access the Workspace tool. NEEShub users should request access to the Workspace tool by entering a Support ticket.
Additionally, users need to request access in order to use the XSEDE venues of Kraken and Stampede. This access should also be requested by entering a Support ticket.
The batchsubmit command is invoked by entering ‘batchsubmit’ at the command line of the Workspace tool. A variety of parameters are available to customize the behavior and performance of jobs submitted through batchsubmit.
Currently available venues are as follows:
- Local (job will execute within the NEEShub infrastructure)
- Kraken (www.xsede.org, NICS, National Institute for Computational Science)
- OSG (OpenScienceGrid)
How to Choose a Venue
Selecting an appropriate venue for your jobs is an important step toward their successful completion. Here are some guidelines on choosing the right venue.
Before submitting a job (especially a parallel job), the user should have an idea of how complex the job is. It is recommended that jobs be tested locally on NEEShub before being submitted to an external venue. Users should have an approximate idea of the number of cores needed to run the job and the expected run time. Once the user determines these two parameters, the venue can be chosen from the following table.
|Venue||Number of Nodes (nn)||Number of Processors per Node (ppn)||Max Number of Processors (ncpus)||Walltime||Comments|
|Local (NEEShub)||1||8 (max)||16 (max) virtual cores|| || |
|Kraken||42 (max)||12 (max)|| ||24:00:00 (max)||ncpus must be a multiple of 12|
|Stampede||256 (max)||16 (max)|| ||24:00:00 (max)||ncpus must be a multiple of 16|
|Hansen||12 (max)||4 (max)||48 (max)||720:00:00 (max)|| |
|Carter||4||16||64 (max)||720:00:00 (max)|| |
An important parameter to note here is the walltime. If the expected running time of the job exceeds the walltime limit, the user should increase the number of processors so that the execution time is reduced. Because the number of processors is limited for each venue, this helps the user narrow down the choice of venue. For each venue, the maximum value of ‘ncpus’ determines the total number of processors available. A user can either specify the ncpus value directly in the batchsubmit command or specify ‘nn’ and/or ‘ppn’ values. When ‘nn’ and ‘ppn’ are specified, each must satisfy its minimum and maximum limits, and their product must not exceed the maximum value of ‘ncpus’.
Based on the complexity of jobs, we can arrive at the following generalization:
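The constraints on ‘nn’, ‘ppn’, and ‘ncpus’ can be checked before a job is submitted. The following is a minimal Python sketch, not part of batchsubmit itself. The names `VenueLimits`, `VENUES`, and `check_request` are hypothetical. It assumes a minimum of 1 for every value (the table gives no minimums), treats the Carter figures as upper bounds, and assumes that the maximum ncpus for Kraken and Stampede equals nodes times processors per node, since the table gives no explicit value for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VenueLimits:
    max_nn: int              # maximum number of nodes
    max_ppn: int             # maximum processors per node
    max_ncpus: int           # maximum total processors
    ncpus_multiple: int = 1  # ncpus must be a multiple of this value


# Limits taken from the venue table. For Kraken and Stampede the table has
# no explicit ncpus maximum, so max_nn * max_ppn is ASSUMED here.
VENUES = {
    "Kraken": VenueLimits(42, 12, 42 * 12, ncpus_multiple=12),
    "Stampede": VenueLimits(256, 16, 256 * 16, ncpus_multiple=16),
    "Hansen": VenueLimits(12, 4, 48),
    "Carter": VenueLimits(4, 16, 64),  # table values treated as upper bounds
}


def check_request(venue, nn=None, ppn=None, ncpus=None):
    """Return a list of problems with a request; an empty list means it fits."""
    limits = VENUES[venue]
    problems = []

    # A minimum of 1 is an assumption; the table only lists maximums.
    if nn is not None and not 1 <= nn <= limits.max_nn:
        problems.append(f"nn={nn} is outside 1..{limits.max_nn}")
    if ppn is not None and not 1 <= ppn <= limits.max_ppn:
        problems.append(f"ppn={ppn} is outside 1..{limits.max_ppn}")

    # Use ncpus if given directly, otherwise derive it from nn * ppn.
    total = ncpus
    if total is None and nn is not None and ppn is not None:
        total = nn * ppn

    if total is not None:
        if not 1 <= total <= limits.max_ncpus:
            problems.append(f"ncpus={total} is outside 1..{limits.max_ncpus}")
        if total % limits.ncpus_multiple != 0:
            problems.append(
                f"ncpus={total} is not a multiple of {limits.ncpus_multiple}"
            )
    return problems
```

For example, `check_request("Hansen", nn=12, ppn=4)` returns an empty list, while `check_request("Kraken", ncpus=100)` reports that 100 is not a multiple of 12.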
|Job Size||Number of Cores||Venue to be Used|
|Large||> 100||Kraken, Stampede|
|Medium||20 < cores < 100||Hansen|
|Small||0 < cores < 20||Local (NEEShub)|
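The generalization above can be expressed as a small lookup. This is an illustrative Python sketch only; the function name `suggest_venues` is hypothetical. The table does not say where jobs of exactly 20 or exactly 100 cores belong, so the sketch returns an empty list for those values rather than guessing.

```python
def suggest_venues(cores):
    """Map a core count to the venue(s) suggested by the job-size table."""
    if cores <= 0:
        raise ValueError("cores must be a positive number")
    if cores < 20:
        return ["Local (NEEShub)"]      # Small job
    if 20 < cores < 100:
        return ["Hansen"]               # Medium job
    if cores > 100:
        return ["Kraken", "Stampede"]   # Large job
    # Exactly 20 or 100 cores: the table leaves these boundaries undefined.
    return []
```

A suggestion from this lookup should still be checked against the per-venue limits and against the output of “batchsubmit --list”, since a venue may be unavailable at submission time.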
NOTE: Due to maintenance or other technical reasons some of the venues listed above may not be available for certain periods of time. Always use the command “batchsubmit --list” to see the venues that are currently available before submitting your job.
Specifying the Location of the Executable
With the new version of batchsubmit, a user who is not using a custom version of the executable need not specify its location with the --appdir parameter. The batchsubmit command will automatically locate the executable and use it. The --appdir parameter is required only if the user wants to use a custom version of the executable.
Batchsubmit Job Completion Notification
Upon completion of a batchsubmit job, NEEShub users will receive an email notification at the email address associated with their NEEShub user account. To determine which email address is associated with your NEEShub user account, log in to NEEShub and click the myNEEShub link found in the upper right corner of most NEEShub screens.
The initial output area for batchsubmit jobs is called Scratch space. Although Scratch space is an extremely large storage area, it is only temporary storage and is not regularly backed up by NEEScomm IT. To ensure long-term access to batchsubmit job output, it is recommended that NEEShub users move the output to a permanent, backed-up storage area with the SynchroNEES tool.
Batchsubmit Help Documentation
To view the basic help documentation and parameter options for batchsubmit, enter ‘batchsubmit -h | more’ at the command line of the Workspace tool.
To request additional help for use of the batchsubmit command or recommended methods for running parallel jobs, enter a Support ticket.