8. Job scripts¶
Job scripts are the main workhorses in an exax project. Most computations are carried out in there, and if the project is partitioned into several job scripts, only those affected by code changes will have to be re-built when re-executing the project’s build script.
8.1. Building Job Scripts¶
Job scripts are built using a build() call, either from a build
script like this
my_script from a build script.¶
def main(urd):
    job = urd.build('my_script', x=3)
or from another job script (as a “subjob”) like this
my_script from another job script.¶
from accelerator.subjobs import build

def synthesis():
    job = build('my_script', x=4)
Unlike build scripts, job scripts are not executable from the command line.
8.2. Existing Jobs Will be Re-Used Whenever Possible¶
The first thing that happens in the build call is that the job script’s source code and input parameters are compared to what already exists in the project workdirs. Then one of two things happens:
The combination of source code and input parameters has not been seen before, and therefore the job script is built and a new job directory is created. When execution finishes, the return value from the build call is a Job object containing references to the new job.
A matching job directory already exists, and the build call immediately returns a Job object containing references to the existing job.
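The two cases above can be pictured as a lookup keyed on source code and input parameters. The following is a minimal sketch of that idea only, not how exax stores or matches jobs; the names `job_key`, `build_or_reuse`, and the in-memory `workdir` dict are hypothetical.

```python
import hashlib
import json

def job_key(source_code, params):
    """Derive a stable key from source code and input parameters."""
    payload = json.dumps({'source': source_code, 'params': params}, sort_keys=True)
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()

def build_or_reuse(workdir, source_code, params, run):
    """Return (result, reused); run the job only if the key is unseen."""
    key = job_key(source_code, params)
    if key in workdir:
        return workdir[key], True    # matching job exists: reuse it
    workdir[key] = run(**params)     # new combination: execute and store
    return workdir[key], False

workdir = {}  # stands in for the project workdirs
source = "def synthesis(): return options.x * 2"
first = build_or_reuse(workdir, source, {'x': 3}, lambda x: x * 2)
second = build_or_reuse(workdir, source, {'x': 3}, lambda x: x * 2)
third = build_or_reuse(workdir, source, {'x': 4}, lambda x: x * 2)
```

Here the second call returns immediately with the stored result, while changing either the source string or a parameter produces a new key and a new execution.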
The job directory will contain all input and output relating to the build, and also meta information about the execution itself. Here is a more or less complete list of what is saved
the job script’s source code
input parameters
build timestamp
profiling information
Python version
exax version
job id of the builder
input directory
method package
any other files the job “depends extra” on
(The build script has additional support functionality, such as JobLists, the Urd database, and result linking, to aid development, whereas building a subjob has fewer decorations.)
8.3. Passing Input Parameters¶
There are three kinds of input parameters to a job script: options, datasets, and jobs. They are all declared early in the job script’s source file, as in the following example
options = dict(
    x=3,
    key='protomolecule',
    f=float,
)
datasets = ('source_dataset', 'anothersourceds',)
jobs = ('previous_job', ['a_list_of_jobs',],)
Note
datasets and jobs are tuples, and therefore it is
key to remember the trailing comma when there is only a single item, as in
jobs=('previous_job',). Otherwise it will be
interpreted as a string of characters, and things will break.
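The trailing-comma rule is plain Python behaviour and can be checked without exax. A short illustration (the variable names are only for this example):

```python
not_a_tuple = ('previous_job')   # parentheses only: this is a str
a_tuple = ('previous_job',)      # trailing comma: a one-item tuple

# Iterating over the string yields single characters, not names.
as_characters = list(not_a_tuple)
as_names = list(a_tuple)
```

`as_characters` starts with `'p'`, `'r'`, `'e'`, whereas `as_names` holds the single name `'previous_job'`.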
The options parameter is a dictionary that can take almost “anything”, with or without default values and type definitions.
The datasets parameter is a list or tuple of dataset references.
The jobs parameter is similar to datasets, but contains a list of job references.
Both datasets and jobs can also take lists as input.
Parameters are assigned by the build call like this:
urd.build('my_script',
    x=37,
    key='some string',
    f=42.0,
    source_dataset=ds,
    previous_job=job0,
    a_list_of_jobs=[job0, job1, job2],
)
Note
In the example above, all parameters have unique names, so
it is not necessary to specify if, say, x is an option,
a dataset, or a job.
If names are not unique, it is possible to explicitly state
the kind of parameter using ..., datasets={'source_dataset': ds},... and so on.
Note
The build()-call consumes a few parameters that are not
forwarded to the job script. The most
common one is name.
An input option can not have the name name for this reason!
8.4. Receiving Input Parameters¶
Inside the job script, parameters are available like in the following example
options = {'x': 37, 'key': 'some string'}
datasets = ('ds',)
jobs = ('previous',)

def synthesis():
    print(options.x, options.key)
    print(datasets.ds.columns)
    print(jobs.previous)
In a running job script, all three parameter types are converted to
the accelerator.DotDict type, which is basically a Python dict
supporting dot-notation for accessing its values.
Tip
Members of the input parameters can be accessed using dot notation,
like options.x etc.
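To make the dot-notation idea concrete, here is a minimal stand-in for such a dictionary. It is a sketch written for this page, not the accelerator.DotDict implementation, and the class name `DotDictSketch` is hypothetical.

```python
class DotDictSketch(dict):
    """A dict whose keys can also be read and set as attributes."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value

options = DotDictSketch(x=37, key='protomolecule')
options.extra = 1  # stored as an ordinary dict item
```

Both `options.x` and `options['x']` refer to the same value, so code can use whichever form reads best.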
8.5. Options: Default Values and Typing¶
If an option is defined with a value (such as
options=dict(x=37)), this value is also the default value that
will be used if none is assigned by the build call. The default value
also affects the typing. A default value of 37 will not match a
string, for example, but it will match a float.
If instead the option is specified using a type, (such as
options=dict(f=float)), the input parameter must be of the same type.
If the input parameter is left unspecified in this case, the (default)
value will be None.
Note
If a default value is set, this value will be used if left unassigned.
A default value also specifies the allowed set of types of the input.
If the default value is a type, this is the only allowed type.
If the default value is a type and left unassigned, its value will become
None.
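The rules in the note can be written down as a small function. This is a simplified sketch under stated assumptions, not exax's option checker: `resolve_option` and `_UNSET` are hypothetical names, and treating int and float as interchangeable is an assumption based only on the "37 will match a float" remark above.

```python
_UNSET = object()  # marks "not assigned by the build call"

def resolve_option(default, value=_UNSET):
    """Return the value a job script would see, following the rules above."""
    if isinstance(default, type):
        # Declared as a type, e.g. f=float: the value is None when unassigned.
        if value is _UNSET:
            return None
        allowed = (default,)
    else:
        # Declared with a value, e.g. x=37: that value is the default.
        if value is _UNSET:
            return default
        allowed = (type(default),)
        if isinstance(default, (int, float)):
            allowed = (int, float)  # assumption: numbers are interchangeable
    if not isinstance(value, allowed):
        raise TypeError('%r does not match %r' % (value, default))
    return value
```

For example, `resolve_option(37)` gives 37, `resolve_option(37, 2.5)` gives 2.5, `resolve_option(float)` gives None, and `resolve_option(37, 'text')` raises a TypeError.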
8.6. Execution and Data Flow¶
There are three functions used for code execution in a job script, of which at least one is mandatory. They are, listed in execution order
prepare()
analysis()
synthesis()
The functions will be described below in reverse order, starting with
synthesis(), since this is the simplest and most commonly used for
more basic job scripts.
8.6.1. synthesis()¶
The synthesis() function is executed as a single process, and its
return value is stored persistently as the job’s output value, as
shown in this example:
a_test.py…¶
options = dict(x=3)

def synthesis():
    val = options.x * 2
    return dict(value=val, caption="this is a test")
When the job has completed execution, the return value is conveniently
available using the returned object’s load() function, like this
build_mytest.py to build it.¶
def main(urd):
    job = urd.build('test', x=10)
    data = job.load()
    print(data['value'])
If this is executed using ax run mytest, the build script will
execute the job script test and print the value “20” to standard
output.
8.6.2. analysis()¶
The analysis() function is intended for parallel processing. When
run, it is forked into a number of parallel processes, called
slices. The number of slices is fixed and specified in the
configuration file:
accelerator.conf specifying number of parallel processes.¶
slices: 64
This can be set to any number at project initialisation, and it is
then the same fixed number for the whole project. The ax init
command will by default initiate this to the number of available cores
on the machine. (It makes little sense to set it to a larger number,
but in some cases a lower number is preferred in order to limit the
max load on the machine.)
The number of slices, as well as the current fork number sliceno
(ranging from zero to slices minus one), are available as parameters
to the analysis() function.
analysis() function.¶
def analysis(sliceno, slices):
    print('This is slice %d/%d' % (sliceno, slices))
    return sliceno * sliceno
Note
sliceno is a mandatory parameter to analysis(). slices is not.
When all forks have completed execution, the return values from all
analysis() calls become available to the synthesis() function
(described earlier) as the analysis_res input parameter.
analysis_res is an iterator, containing one element per analysis
process. It also has a convenient method for merging all
results together, like this
analysis_res and its automagic result merger merge_auto().¶
def synthesis(analysis_res):
    x = analysis_res.merge_auto()
merge_auto() typically does what is expected (but is of course not
mandatory to use). In the example above, the returned integers from
analysis() will be added together into one number. It will merge
sets or dictionaries, update Counters, etc.
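The kind of merging described above can be imitated in plain Python. The sketch below is a simplified stand-in, not the merge_auto() implementation: `merge_two` and `merge_all` are hypothetical names, and dictionary values are not merged recursively here (a later slice simply overwrites an earlier one on key clashes).

```python
from collections import Counter
from functools import reduce

def merge_two(a, b):
    """Combine two per-slice results of the same kind."""
    if isinstance(a, Counter):   # checked first: Counter is a dict subclass
        return a + b
    if isinstance(a, dict):
        return {**a, **b}
    if isinstance(a, (set, frozenset)):
        return a | b
    return a + b                 # numbers, lists, strings

def merge_all(results):
    """Fold all per-slice results into one value."""
    return reduce(merge_two, results)

slices = 4
per_slice = [sliceno * sliceno for sliceno in range(slices)]
total = merge_all(per_slice)     # 0 + 1 + 4 + 9
```

With four slices returning `sliceno * sliceno`, the merged value is 14, matching the "added together into one number" behaviour described for integers.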
8.6.3. prepare()¶
The prepare() function, if present, is executed first, and just
like synthesis() it runs in a single process. The main reason for
prepare() is to simplify any preparation work like setting up
data structures and datasets prior to parallel processing in the
analysis() function. If no parallel processing is required, it is
encouraged to use just synthesis() instead of prepare().
The return value from prepare() is available to both
analysis() and synthesis() as prepare_res, like this
prepare_res example¶
def prepare(job):
    dw = job.datasetwriter()
    dw.add('index', 'number')
    return dw

def analysis(sliceno, prepare_res):
    dw = prepare_res
    for ix in range(10):
        dw.write(ix)
8.6.4. Function Inputs and Outputs¶
As shown in the previous section, analysis_res is available to synthesis(), and prepare_res is available to both analysis() and synthesis().
In addition, analysis() has access to the sliceno and
slices parameters, and all three functions have access to the
job object that contains a set of useful job-related helper
functions.
Return values from prepare() and analysis() are stored
temporarily in the job directory by default, and removed upon job
completion. In contrast, the return value from synthesis() is
stored persistently and considered to be the default output from the
job.
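The data flow between the three functions can be emulated with a small driver that passes each function only the parameters named in its signature. This is an illustrative sketch, not exax's runner: it runs the slices sequentially instead of forking, requires all three functions, and the names `call_with` and `run_job` are hypothetical.

```python
import inspect

def call_with(func, available):
    """Call func with only the parameters its signature asks for."""
    wanted = inspect.signature(func).parameters
    return func(**{name: available[name] for name in wanted})

def run_job(prepare, analysis, synthesis, slices):
    """Run prepare, then analysis once per slice, then synthesis."""
    available = {'slices': slices}
    available['prepare_res'] = call_with(prepare, available)
    available['analysis_res'] = [
        call_with(analysis, dict(available, sliceno=sliceno))
        for sliceno in range(slices)
    ]
    return call_with(synthesis, available)

def prepare():
    return 10

def analysis(sliceno, prepare_res):
    return prepare_res + sliceno

def synthesis(analysis_res):
    return sum(analysis_res)

result = run_job(prepare, analysis, synthesis, slices=3)  # 10 + 11 + 12
```

Because parameters are matched by name, `analysis()` may omit `slices` and `synthesis()` may omit `prepare_res` without any change to the driver.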
8.8. Writing Files¶
Any file written by a job is stored in the current job directory. This is also where the source code and input parameters to the current build are stored. Keeping everything in one place ensures that the relationship between input, source code, and output is always clear.
Note
Files created by a job are and should always be stored in the corresponding job directory. To simplify this, the current working directory is by default set to the current job directory while the job script is executing. Avoiding filenames with absolute paths will ensure that the files end up in the current job directory.
Files can be created by any means, but it is encouraged to use the built-in helper functions that among other things will create files in the correct location. These functions will also register the files, which is the topic of the next section.
The first helper function is job.save(). This stores data as a
Python pickle file:
def synthesis(job):
    data = ...
    job.save(data, 'thisisthenameofapicklefile')
There is also a dedicated function for writing json files:
def synthesis(job):
    data = ...
    job.json_save(data, 'andthisisajsonfile')
In addition, there is a generic job.open() function as well, that
is a wrapper around Python’s open() function:
job.open().¶
def synthesis(job):
    data = ...
    with job.open('thefilename', 'wt') as fh:
        fh.write(data)
Note
Reading and writing files in analysis() is special,
because this function is running as several parallel processes. For
this reason, it is possible to work with sliced files, simply
meaning that one “filename” in the program corresponds to a set of
files on disk, one for each process.
This is handled using save(..., sliceno=sliceno).
In addition, it is possible to create temporary files that only
exist during the execution of the job script and will be
automatically deleted upon job completion. This might be useful for
huge temporary files if disk space is a major concern. Add the
parameter temp=True to job.save() or job.json_save() to
make the file temporary.
Tip
It is possible to create “parallel” files in analysis. A
parallel file is a set of files, one per slice, that is
associated with a single filename by exax.
The basic idea is that one can do
def analysis(sliceno, job):
    data = ...
    job.save(data, 'filename', sliceno=sliceno)   # save per slice
    data = job.load('filename', sliceno=sliceno)  # load per slice
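The "one filename, one file per slice" idea can be tried outside exax with pickle and a temporary directory. This is only a sketch: the helper names are hypothetical, and the on-disk naming scheme (`filename.<sliceno>`) is an assumption made for the example, not the scheme exax uses.

```python
import os
import pickle
import tempfile

def sliced_path(directory, filename, sliceno):
    # Hypothetical naming scheme: one file per slice, suffixed by slice number.
    return os.path.join(directory, '%s.%d' % (filename, sliceno))

def save_slice(directory, data, filename, sliceno):
    with open(sliced_path(directory, filename, sliceno), 'wb') as fh:
        pickle.dump(data, fh)

def load_slice(directory, filename, sliceno):
    with open(sliced_path(directory, filename, sliceno), 'rb') as fh:
        return pickle.load(fh)

with tempfile.TemporaryDirectory() as jobdir:  # stands in for the job directory
    for sliceno in range(3):
        save_slice(jobdir, {'slice': sliceno}, 'filename', sliceno)
    loaded = [load_slice(jobdir, 'filename', sliceno) for sliceno in range(3)]
    on_disk = sorted(os.listdir(jobdir))
```

Each slice reads back exactly what it wrote, and no two processes ever touch the same file, which is the property that makes this pattern safe in parallel code.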
8.9. Registering Files¶
Registering a file means making exax aware of it, so that simple
helper functions can list and retrieve the data directly from a job
object. For example, registered files can be listed using
job.files(), and accessed using job.open() or
job.json_load(). Registered files are also trivially added to the
exax Board web server for visual inspection.
Almost all created files are registered automatically by default when the job script finishes execution. Files in subdirectories are the exception; they are not automatically registered.
Note
Files in subdirectories are not registered automatically.
Files can also be registered manually. Manual registration does, however, turn off automatic registration for all files. Registration is either manual or automatic.
Note
If a file is manually registered, automatic registration is disabled for all other files, so they have to be registered manually too, if needed.
Calls to job.save(), job.json_save(), and job.open() will
register the created file, and turn off automatic registration of
all other files. This is a reasonable default.
To register a file manually, use job.register_file(), for example
like this, when the file has been created by an external command:
import subprocess

def synthesis(job):
    # use external program ffmpeg to generate a movie file "out.mp4"
    subprocess.run(['ffmpeg', ..., 'out.mp4'])
    job.register_file('out.mp4')
Several files could be registered at once using glob patterns, like this
job.register_files()¶
def synthesis(job):
    # create file "myfile1.txt", "myfile2.txt", ..., "myfile10.txt"
    job.register_files("myfile*.txt")
Note
The call job.register_files() will return a set containing the names of all files that were registered!
Temporary files are not registered, even though they are created by the helper functions. On the other hand, if a temporary file is registered manually, it stops being temporary.
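Pattern-based registration that returns the set of matched names can be mimicked with the standard library. The sketch below only illustrates the glob matching and the returned set; `register_matching` and the `registered` set are hypothetical and do not reflect how exax records registrations.

```python
import fnmatch
import os
import tempfile

def register_matching(directory, pattern, registered):
    """Add files matching pattern to `registered`; return the matched names."""
    found = set(fnmatch.filter(os.listdir(directory), pattern))
    registered |= found
    return found

with tempfile.TemporaryDirectory() as jobdir:  # stands in for the job directory
    for name in ('myfile1.txt', 'myfile2.txt', 'notes.log'):
        with open(os.path.join(jobdir, name), 'w') as fh:
            fh.write('example\n')
    registered = set()
    newly = register_matching(jobdir, 'myfile*.txt', registered)
```

Only the two `myfile*.txt` names end up in `newly`; `notes.log` is left unregistered because it does not match the pattern.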
8.10. Find and Load Created Files¶
Files in a job are easily accessible by other job scripts and build scripts, see this example where data created in a job is read back into the running build script. The example assumes the files are registered, but this is not a requirement.
# in the job script "a_methodthatsavefiles.py"
def synthesis(job):
    ...
    job.save(data1, 'afilename')
    job.save(data2, 'anotherfilename')

# in the build script "build.py"
def main(urd):
    job = urd.build('methodthatsavefiles')
    data = {}
    for filename in job.files():
        data[filename] = job.load(filename)
There is also a job.json_load() function to directly load json
content. Note that exax has no idea whether a file is json, pickle, or
something else. Make sure to use the proper functions.
The names of all registered files in a job are available using
job.files(). This call will return a set of all filenames in the
job. The absolute path of a particular file can be retrieved using
the job.filename() function, like this
def main(urd):
    job = urd.build('my_script', ...)
    print(job.files())
    print(job.filename('myfile'))
Note
There is no need to use absolute paths with exax. Absolute paths should in fact be avoided, since they prevent moving things around in the file system later.
But it is nice to know that it is very easy to find any file generated in an exax project.
Tip
Files can also be listed and viewed in exax Board using a web browser.
Tip
The ax job shell command can also list and view files in a job.
8.11. Reading Input Files¶
Input data files should ideally be stored in the input directory
specified in the configuration file. If so, input files could be
addressed using a relative path, and therefore be moved around in the
file system without causing any changes to the project code.
There are three helper functions for input data:
# Returns the path to the input directory.
job.input_directory()
# Returns the full path to a specific file in the input directory.
# Multiple arguments will be fed to Python's os.path.join()
job.input_filename('thefile')
job.input_filename('or', 'a', 'path', 'to', 'thefile')
# Opens a file in the input directory for reading.
# (This is a wrapper around Python's open() function.)
fh = job.open_input('thefile', 'rb')
Tip
Use the input_directory and corresponding helper
functions to avoid having absolute paths in your project code!
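Since the comment above states that multiple arguments are fed to os.path.join(), the path-building part can be sketched in plain Python. The function below is a stand-in written for this page, and the `input_directory` value is a hypothetical placeholder for whatever the configuration file specifies.

```python
import os

def input_filename_sketch(input_directory, *parts):
    """Join path components below the input directory."""
    return os.path.join(input_directory, *parts)

# Hypothetical location; in a real project this comes from the configuration file.
input_directory = os.path.join('data', 'input')
path = input_filename_sketch(input_directory, 'or', 'a', 'path', 'to', 'thefile')
```

Project code then only names files relative to the input directory, so moving the directory requires a configuration change and nothing else.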
8.12. Adding a Description¶
A text description is added to a job script using the description
variable. This description is visible in exax Board and using
the ax method command, and it looks like this
description="""Collect movie ratings.
Movie ratings are collected using a parallel iteration
over all...
"""
Tip
Use ax method or exax Board to see descriptions of all
available job scripts.
Descriptions work much like git commit messages. If the description
is multi-lined, the first row is a short description that will be
shown when typing ax method to list all job scripts and their
short descriptions. A detailed description may follow on consecutive
lines, and it will be shown when doing ax method <a particular
job script>.
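The short/detailed convention amounts to splitting the description at its first line break. A minimal sketch of that split follows; `split_description` is a hypothetical helper, not part of exax, and the detailed text is made up for the example.

```python
description = """Collect movie ratings.
Ratings are read in parallel and summarised per movie.
"""

def split_description(text):
    """Return the (short, detailed) parts of a description."""
    first, _, rest = text.strip().partition('\n')
    return first.strip(), rest.strip()

short, detailed = split_description(description)
```

A single-line description yields an empty detailed part, so the same helper covers both cases.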
8.13. Retrieving stdout and stderr¶
Everything written to stdout and stderr (using for example
plain print()-statements) is always stored persistently in the job
directory. Use the job object’s output() function to access it:
def main(urd):
    job = urd.build('my_script')
    print(job.output())  # contains both stdout and stderr
It can also be retrieved using the ax job command, for example
like this
ax job print stdout and stderr¶
ax job test-43 -O
And it is straightforward to view the output in the Board web server as well.
8.13.1. Output retrieval in more detail¶
The combined stdout and stderr output is stored in the job directory like this
job-x/
    OUTPUT/
        prepare      # created if any output in prepare()
        synthesis    # synthesis()
        0            # analysis() slice 0
        3            # analysis() slice 3
Note that no empty files will be created.
It is possible to access any part of the output, as shown in the following examples
job.output() # everything
job.output('prepare')
job.output('synthesis')
job.output(0)
job.output(3)
8.14. Progress/status reporting¶
If a job takes a long time to complete, pressing CTRL+T will force
exax to print a message on stdout. This message can be tailored to
the running program in the following way
CTRL+T)¶
from accelerator import status

def synthesis():
    msg = "my status message: %s"
    with status(msg % ('init',)) as update:
        for task in tasklist:
            update(msg % (task,))
In this example, the status message is updated for each new task in tasklist. The output message automatically includes the execution time when running in prepare, analysis, or synthesis; in analysis it also states which slice the message belongs to. It may for example look like this
589443 STATUS: analysis(2) (9.0 seconds)
589443 STATUS: my status message: task_number_one
or
589443 STATUS: synthesis (14.1 seconds)
589443 STATUS: my status message: the_synthesis_run
Tip
Exax Dataset iterators use status reporting to tell which Dataset in a Dataset chain they are currently working on.
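The `with ... as update` pattern used for status messages is an ordinary context manager that hands back an update function. The sketch below shows that shape in isolation. It is not accelerator.status: `status_sketch` is a hypothetical name, it is not connected to CTRL+T, and it simply appends every message to a list so the sequence can be inspected.

```python
from contextlib import contextmanager

@contextmanager
def status_sketch(message, log):
    """Record the initial message, then yield a function that records updates."""
    log.append(message)

    def update(new_message):
        log.append(new_message)

    try:
        yield update
    finally:
        log.append(None)  # marks that the status is cleared on exit

log = []
msg = "my status message: %s"
with status_sketch(msg % ('init',), log) as update:
    for task in ('task_number_one', 'task_number_two'):
        update(msg % (task,))
```

The `finally` clause ensures the status is cleared even if the loop body raises, which is the main reason to prefer a context manager over paired set and clear calls.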
8.15. Subjobs¶
Job scripts are typically built by build scripts, but in a similar way job scripts can be built by other job scripts. There is no difference from a built job’s perspective, but the nomenclature is that when a job script is building a job it is called a subjob.
Subjobs are built in the synthesis() function like this
from accelerator.subjobs import build

def synthesis():
    job = build('my_script')
The subjobs.build() call uses the same input parameters and syntax
as the urd.build() call in a build script. Similarly, the
returned job object is an instance of the Job class that
contains some useful helper functionality.
Note
Subjobs are not visible in build scripts and do not show
up in urd.joblist! Furthermore, they are not recorded in the
urd database.
Subjobs are registered in the post-data of a job and can be retrieved by
inspecting job.post.subjobs.
8.15.1. Subjobs and Datasets¶
Datasets created by subjobs can be made available to the job that built the subjob, to make it look like the dataset was created there. It works as shown in the following example
from accelerator import subjobs

def synthesis():
    job = subjobs.build('create_a_dataset')
    ds = job.dataset(<name>)
    ds = ds.link_to_here(name=<anothername>)
In the example above, the job script create_a_dataset creates a
dataset. A reference to this dataset is created using the
job.dataset() function. Finally, using the ds.link_to_here()
function, a soft link is created in the current job directory,
pointing to the job directory of the subjob, completing the illusion
that the dataset is created by the current job script.
Similarly, it is possible to override the dataset’s previous, like so
...
ds = ds.link_to_here(name=<anothername>, override_previous=<some dataset>)
The ds.link_to_here() function returns a reference to the “new”
linked dataset.