8. Job scripts

Job scripts are the main workhorses in an exax project. Most computations are carried out in them, and if the project is partitioned into several job scripts, only those affected by code changes have to be re-built when the project’s build script is re-executed.

8.1. Building Job Scripts

Job scripts are built using a build() call, either from a build script like this

Building my_script from a build script.
def main(urd):
    job = urd.build('my_script', x=3)

or from another job script (as a “subjob”) like this

Building my_script from another job script.
from accelerator.subjobs import build

def synthesis():
    job = build('my_script', x=4)

Unlike build scripts, job scripts are not executable from the command line.

8.2. Existing Jobs Will be Re-Used Whenever Possible

The first thing that happens in the build-call is that the job script’s source code and input parameters are compared to what already exists in the project workdirs. Then, one out of two things will happen

  1. The combination of source code and input parameters has not been seen before, and therefore the job script is built and a new job directory is created. When execution finishes, the return value from the build call is a Job object containing references to the new job.

  2. A matching job directory already exists, and the build call immediately returns a Job object containing references to the existing job.
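The two cases above can be pictured with a small stand-alone sketch. This is not how exax is implemented: the names job_key and build_sketch, the in-memory workdir dictionary, and the choice to hash source code plus parameters are all assumptions made for illustration only.

```python
import hashlib
import json

workdir = {}  # hypothetical stand-in for the project workdirs: key -> job id


def job_key(source, params):
    # Fingerprint the combination of source code and input parameters.
    blob = json.dumps({'source': source, 'params': params}, sort_keys=True)
    return hashlib.sha256(blob.encode('utf-8')).hexdigest()


def build_sketch(source, **params):
    # Return (job id, was_built); build only if the combination is new.
    key = job_key(source, params)
    if key in workdir:
        return workdir[key], False   # case 2: re-use the existing job
    jobid = 'test-%d' % (len(workdir),)
    workdir[key] = jobid             # case 1: build and remember the new job
    return jobid, True


source = "def synthesis():\n    return options.x * 2\n"
first = build_sketch(source, x=3)     # new combination, so it is built
again = build_sketch(source, x=3)     # same combination, so it is re-used
changed = build_sketch(source, x=4)   # new parameter value, so it is built
```

Changing either the source string or any parameter produces a different key, which is why only affected jobs would be re-built in this model.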

The job directory will contain all input and output relating to the build, and also meta information about the execution itself. Here is a more or less complete list of what is saved

  • the job script’s source code

  • input parameters

  • build timestamp

  • profiling information

  • Python version

  • exax version

  • job id of the builder

  • input directory

  • method package

  • any other files the job “depends extra” on

(The build script has additional support functionality, such as JobLists, the Urd database, and result linking, to aid development, whereas building a subjob has fewer decorations.)

8.3. Passing Input Parameters

There are three kinds of input parameters to a job script: options, datasets, and jobs. They are all declared early in the job script’s source file, as in the following example

Example of all three types of input parameters specified in a job script.
options = dict(
    x=3,
    key='protomolecule',
    f=float,
)
datasets = ('source_dataset', 'anothersourceds',)
jobs = ('previous_job', ['a_list_of_jobs',],)

Note

datasets and jobs are tuples, so it is key to remember the trailing comma when there is only a single item, as in jobs=('previous_job',). Without the comma, the value will be interpreted as a plain string of characters, and things will break.
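The pitfall is plain Python behaviour and can be checked without exax. The variable names below are only for illustration.

```python
with_comma = ('previous_job',)     # a tuple holding one name
without_comma = ('previous_job')   # just a parenthesised string

print(type(with_comma).__name__)
print(type(without_comma).__name__)

# Iterating shows the practical difference.
names_ok = list(with_comma)          # one entry: the name itself
names_broken = list(without_comma)   # one entry per character
```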

  • The options parameter is a dictionary that can take almost “anything”, with or without default values and type definitions.

  • The datasets parameter is a list or tuple of dataset references.

  • The jobs parameter is similar to datasets, but contains a list of job references.

  • Both datasets and jobs can also take lists as input.

Parameters are assigned by the build call like this:

Assigning input parameters to a build.
urd.build('my_script',
      x=37,
      key='some string',
      f=42.0,
      source=ds,
      previous=job0,
      a_list_of_jobs=[job0, job1, job2],
)

Note

In the example above, all parameters have unique names, so it is not necessary to specify if, say, x is an option, a dataset, or a job.

If names are not unique, it is possible to explicitly state the kind of parameter using ..., datasets={'source': ds},... and so on.

Note

The build()-call consumes a few parameters that are not forwarded to the job script. The most common one is name.

For this reason, an input option cannot be named name!

8.4. Receiving Input Parameters

Inside the job script, parameters are available like in the following example

Print some input parameters to stdout.
options={'x': 37, 'key': 'mykey',}
datasets=('ds',)
jobs=('previous',)

def synthesis():
    print(options.x, options.key)
    print(datasets.ds.columns)
    print(jobs.previous)

In a running job script, all three parameter types are converted to the accelerator.DotDict type, which is basically a Python dict supporting dot-notation for accessing its values.

Tip

Input parameter members can be accessed using dot notation, like options.x.
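To illustrate what “a dict supporting dot-notation” means, here is a minimal stand-in. The class DotDictSketch is hypothetical and is not the accelerator.DotDict implementation, which may behave differently in details.

```python
class DotDictSketch(dict):
    """Hypothetical stand-in for accelerator.DotDict; illustration only."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Attribute assignment stores the value as a dict entry.
        self[name] = value


options = DotDictSketch(x=37, key='some string')
print(options.x)        # same value as options['x']
options.key = 'other'   # updates options['key']
```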

8.5. Options: Default Values and Typing

If an option is defined with a value (such as options=dict(x=37)), this value is also the default value that will be used if none is assigned by the build call. The default value also affects the typing. A default value of 37 will not match a string, for example, but it will match a float.

If instead the option is specified using a type (such as options=dict(f=float)), the input parameter must be of the same type. If the input parameter is left unspecified in this case, the (default) value will be None.

Note

  • If a default value is set, this value will be used if left unassigned.

  • A default value also specifies the allowed set of types of the input.

  • If the default value is a type, this is the only allowed type.

  • If the default value is a type and left unassigned, its value will become None.
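The rules in the note can be summarised as a small decision function. This is a sketch under stated assumptions, not the actual exax type matching: resolve_option and _UNSET are hypothetical names, and the rule that a numeric default accepts both int and float is inferred only from the statement that a default of 37 matches a float.

```python
_UNSET = object()  # marks "not assigned by the build call"


def resolve_option(default, given=_UNSET):
    if isinstance(default, type):
        # The option was declared with a type only, e.g. f=float.
        if given is _UNSET:
            return None
        allowed = default
    else:
        # The option was declared with a default value, e.g. x=37.
        if given is _UNSET:
            return default
        # Assumption for this sketch: numeric defaults accept int and float.
        if isinstance(default, (int, float)):
            allowed = (int, float)
        else:
            allowed = type(default)
    if not isinstance(given, allowed):
        raise TypeError('%r does not match %r' % (given, default))
    return given


print(resolve_option(37))         # unassigned: the default value is used
print(resolve_option(37, 42.0))   # a float matches a numeric default
print(resolve_option(float))      # typed and unassigned: None
```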

8.6. Execution and Data Flow

There are three functions used for code execution in a job script, of which at least one is mandatory. They are, listed in execution order

  • prepare()

  • analysis()

  • synthesis()

The functions will be described below in reverse order, starting with synthesis(), since this is the simplest and most commonly used for more basic job scripts.

8.6.1. synthesis()

The synthesis() function is executed as a single process, and its return value is stored persistently as the job’s output value, as shown in this example:

This is job script a_test.py
options = dict(x=3)
def synthesis():
    val = options.x * 2
    return dict(value=val, caption="this is a test")

When the job has completed execution, the return value is conveniently available using the returned object’s load() function, like this

…and a corresponding build script build_mytest.py to build it.
def main(urd):
    job = urd.build('test', x=10)
    data = job.load()
    print(data['value'])

If this is executed using ax run mytest, the build script will execute the job script test and print the value “20” to standard output.

8.6.2. analysis()

The analysis() function is intended for parallel processing. When run, it is forked into a number of parallel processes, called slices. The number of slices is fixed and specified in the configuration file:

Part of accelerator.conf specifying number of parallel processes.
slices: 64

This can be set to any number at project initialisation, and it is then the same fixed number for the whole project. The ax init command will by default initialise this to the number of available cores on the machine. (It makes little sense to set it to a larger number, but in some cases a lower number is preferred in order to limit the max load on the machine.)

The number of slices, as well as the current fork number sliceno (ranging from zero to slices minus one), are available as parameters to the analysis() function.

Example of analysis() function.
def analysis(sliceno, slices):
    print('This is slice %d/%d' % (sliceno, slices))
    return sliceno * sliceno

Note

sliceno is a mandatory parameter to analysis(). slices is not.

When all forks have completed execution, the return values from all analysis() calls become available to the synthesis() function (described earlier) as the analysis_res input parameter. analysis_res is an iterator containing one element per analysis process. It also has a convenient method for merging all results together, like this

Use of analysis_res and its automagic result merger merge_auto().
def synthesis(analysis_res):
    x = analysis_res.merge_auto()

merge_auto() typically does what is expected (but is of course not mandatory to use). In the example above, the returned integers from analysis() will be added together into one number. It will merge sets or dictionaries, update Counters, etc.
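The merging behaviour described above can be approximated in plain Python. The function merge_sketch below is a hypothetical stand-in, not the real merge_auto(), and the analysis calls are run sequentially in a list comprehension instead of as parallel processes.

```python
from collections import Counter


def _merge_two(a, b):
    if isinstance(a, Counter):
        return a + b                      # Counters: counts are added
    if isinstance(a, dict):
        out = dict(a)
        for key, value in b.items():      # dicts: merge values key by key
            out[key] = _merge_two(out[key], value) if key in out else value
        return out
    if isinstance(a, set):
        return a | b                      # sets: union
    return a + b                          # numbers: added together


def merge_sketch(results):
    # Hypothetical stand-in for analysis_res.merge_auto(); illustration only.
    results = list(results)
    merged = results[0]
    for item in results[1:]:
        merged = _merge_two(merged, item)
    return merged


def analysis(sliceno, slices):
    return sliceno * sliceno


slices = 4
analysis_res = [analysis(sliceno, slices) for sliceno in range(slices)]
total = merge_sketch(analysis_res)   # 0 + 1 + 4 + 9
```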

8.6.3. prepare()

The prepare() function, if present, is executed first, and just like synthesis() it runs in a single process. The main reason for prepare() is to simplify any preparation work, like setting up data structures and datasets, prior to parallel processing in the analysis() function. If no parallel processing is required, using just synthesis() instead of prepare() is encouraged.

The return value from prepare() is available to both analysis() and synthesis() as prepare_res, like this

prepare_res example
def prepare(job):
    dw = job.datasetwriter()
    dw.add('index', 'number')
    return dw

def analysis(sliceno, prepare_res):
    dw = prepare_res
    for ix in range(10):
        dw.write(ix)

8.6.4. Function Inputs and Outputs

As shown in the previous section,
  • analysis_res is available to synthesis(), and

  • prepare_res is available to both analysis() and synthesis().

In addition, analysis() has access to the sliceno and slices parameters, and all three functions have access to the job object that contains a set of useful job-related helper functions.

Return values from prepare() and analysis() are stored temporarily in the job directory by default, and removed upon job completion. In contrast, the return value from synthesis() is stored persistently and considered to be the default output from the job.
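The data flow between the three functions can be mimicked with a tiny driver. This is only a sequential sketch: run_job_sketch is a hypothetical name, arguments are passed positionally here for simplicity, and real exax forks analysis() into parallel processes and handles the intermediate storage itself.

```python
def run_job_sketch(prepare, analysis, synthesis, slices):
    # Execution order: prepare(), then analysis() once per slice, then synthesis().
    prepare_res = prepare()
    analysis_res = [analysis(sliceno, slices, prepare_res)
                    for sliceno in range(slices)]
    return synthesis(prepare_res, analysis_res)


def prepare():
    return dict(offset=100)


def analysis(sliceno, slices, prepare_res):
    # prepare_res is visible in every slice.
    return prepare_res['offset'] + sliceno


def synthesis(prepare_res, analysis_res):
    # Both prepare_res and analysis_res are visible here.
    return dict(offset=prepare_res['offset'], total=sum(analysis_res))


result = run_job_sketch(prepare, analysis, synthesis, slices=3)
```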

8.7. Share Data using Return Value

The simplest way to share data between a job script and another job or build script is to use the return value.

To make data created by a job script available elsewhere, just return it:

Example of return value.
def synthesis():
    data = ...
    return data

Then this data is available in a build script like this

Return value from job script into a build script.
def main(urd):
    job = urd.build('scriptreturndata')
    data = job.load()

Similarly, to access the data in another job script

Return value from one job script to another.
jobs=('jobreturndata',)
def synthesis():
    data = jobs.jobreturndata.load()

assuming it was provided by the build script

Corresponding build script passing the first job as input to the second.
def main(urd):
    job = urd.build('scriptreturndata')
    urd.build('scriptusingdata', jobreturndata=job)

Note how easy it is to share files between jobs. If the return value is used, there is no need for arbitrary filenames at all. And even if filenames are used, exax looks up the correct file using a combination of filename, job input parameters, and source code. There is therefore no need to manually keep track of different versions of output files: they can all share the same filename.

8.8. Writing Files

Any file written by a job is stored in the current job directory. This is also where the source code and input parameters to the current build are stored. Keeping everything in one place ensures that the relationship between input, source code, and output is always clear.

Note

Files created by a job are and should always be stored in the corresponding job directory. By default, the current working directory is set to the current job directory when the job script is executing to simplify this. Avoiding filenames with absolute paths will ensure that the files end up in the current job directory.

Files can be created by any means, but it is encouraged to use the built-in helper functions that among other things will create files in the correct location. These functions will also register the files, which is the topic of the next section.

The first helper function is job.save(). This stores data as a Python pickle file:

Writing a pickle file.
def synthesis(job):
    data = ...
    job.save(data, 'thisisthenameofapicklefile')

There is also a dedicated function for writing json files:

Writing a json file.
def synthesis(job):
    data = ...
    job.json_save(data, 'andthisisajsonfile')

In addition, there is a generic job.open() function as well, that is a wrapper around Python’s open() function:

Use of job.open().
def synthesis(job):
    data = ...
    with job.open('thefilename', 'wt') as fh:
        fh.write(data)
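For readers unfamiliar with the two file formats used by the helpers above, the following stand-alone sketch writes and reads the same data with the standard library only. It does not call exax; a temporary directory plays the role of the job directory, and the real helpers do more than this (for example registering the file).

```python
import json
import os
import pickle
import tempfile

data = {'value': 20, 'caption': 'this is a test'}

# A temporary directory stands in for the current job directory.
with tempfile.TemporaryDirectory() as jobdir:
    pickle_path = os.path.join(jobdir, 'thisisthenameofapicklefile')
    json_path = os.path.join(jobdir, 'andthisisajsonfile')

    with open(pickle_path, 'wb') as fh:   # pickle files are binary
        pickle.dump(data, fh)
    with open(json_path, 'wt') as fh:     # json files are text
        json.dump(data, fh)

    with open(pickle_path, 'rb') as fh:
        from_pickle = pickle.load(fh)
    with open(json_path, 'rt') as fh:
        from_json = json.load(fh)
```

A file must be read back with the loader that matches how it was written, since the two formats are not interchangeable.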

Note

Reading and writing files in analysis() is special, because this function is running as several parallel processes. For this reason, it is possible to work with sliced files, simply meaning that one “filename” in the program corresponds to a set of files on disk, one for each process.

This is handled using save(..., sliceno=sliceno).

In addition, it is possible to create temporary files that only exist during the execution of the job script and will be automatically deleted upon job completion. This might be useful for huge temporary files if disk space is a major concern. Add the parameter temp=True to job.save() or job.json_save() to make the file temporary.

Tip

It is possible to create “parallel” files in analysis. A parallel file is a set of files, one per slice, that is associated with a single filename by exax.

The basic idea is that one can do

def analysis(sliceno, job):
    data = ...

    job.save(data, 'filename', sliceno=sliceno)   # save per slice
    data = job.load('filename', sliceno=sliceno)  # load per slice
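The idea of one logical filename backed by one file per slice can be sketched with the standard library. The helper names and the "<filename>.<sliceno>" naming scheme below are assumptions made for this sketch; how exax actually names sliced files on disk is not stated here.

```python
import os
import pickle
import tempfile


def sliced_name(filename, sliceno):
    # Assumed naming scheme for this sketch only: "<filename>.<sliceno>".
    return '%s.%d' % (filename, sliceno)


def save_sliced(jobdir, data, filename, sliceno):
    with open(os.path.join(jobdir, sliced_name(filename, sliceno)), 'wb') as fh:
        pickle.dump(data, fh)


def load_sliced(jobdir, filename, sliceno):
    with open(os.path.join(jobdir, sliced_name(filename, sliceno)), 'rb') as fh:
        return pickle.load(fh)


slices = 3
with tempfile.TemporaryDirectory() as jobdir:
    # Each slice writes its own file under the same logical name.
    for sliceno in range(slices):
        save_sliced(jobdir, sliceno * 10, 'filename', sliceno)
    on_disk = sorted(os.listdir(jobdir))
    loaded = [load_sliced(jobdir, 'filename', sliceno)
              for sliceno in range(slices)]
```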

8.9. Registering Files

Registering a file means making exax aware of it, so that simple helper functions can list and retrieve the data directly from a job object. For example, registered files can be listed using job.files(), and accessed using job.open() or job.json_load(). Registered files are also trivially added to the exax Board web server for visual inspection.

Almost all created files are registered automatically by default when the job script finishes execution. Files in subdirectories are the exception: they are not automatically registered.

Note

Files in subdirectories are not registered automatically.

Files can also be registered manually. Manual registration does, however, turn off automatic registration for all files. Registration is either manual or automatic.

Note

If a file is manually registered, automatic registration is disabled for all other files, so they have to be registered manually too, if needed.

Calls to job.save(), job.json_save(), and job.open() will register the created file, and turn off automatic registration of all other files. This is a reasonable default.

To register a file manually, use job.register_file(), for example like this, when the file has been created by an external command:

Register a file created by external program.
import subprocess

def synthesis(job):
    # use external program ffmpeg to generate a movie file "out.mp4"
    subprocess.run(['ffmpeg', ..., 'out.mp4'])
    job.register_file('out.mp4')

Several files could be registered at once using glob patterns, like this

Registering several files using job.register_files()
def synthesis(job):
    # create file "myfile1.txt", "myfile2.txt", ..., "myfile10.txt"
    job.register_files("myfile*.txt")

Note

The call job.register_files() will return a set containing the names of all files that were registered!
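How a glob pattern selects files, and why a set of names is a natural return value, can be shown with the standard library. The function register_files_sketch is a hypothetical stand-in that only matches names; it does not register anything with exax.

```python
import fnmatch
import os
import tempfile


def register_files_sketch(jobdir, pattern):
    # Hypothetical stand-in: return the set of file names matching the pattern.
    return {name for name in os.listdir(jobdir)
            if fnmatch.fnmatch(name, pattern)}


with tempfile.TemporaryDirectory() as jobdir:
    # Create "myfile1.txt" ... "myfile10.txt" plus one unrelated file.
    for n in range(1, 11):
        with open(os.path.join(jobdir, 'myfile%d.txt' % (n,)), 'wt') as fh:
            fh.write('content %d\n' % (n,))
    with open(os.path.join(jobdir, 'other.log'), 'wt') as fh:
        fh.write('not matched\n')

    registered = register_files_sketch(jobdir, 'myfile*.txt')
```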

Temporary files are not registered, even though they are created by the helper functions. On the other hand, if a temporary file is being registered manually, it stops being temporary.

8.10. Find and Load Created Files

Files in a job are easily accessible by other job scripts and build scripts, see this example where data created in a job is read back into the running build script. The example assumes the files are registered, but this is not a requirement.

Writing and reading files
 # in the job script "a_methodthatsavefiles.py"
 def synthesis(job):
     ...
     job.save(data1, 'afilename')
     job.save(data2, 'anotherfilename')

 # in the build script "build.py"
 def main(urd):
     job = urd.build('methodthatsavefiles')
     data = {}
     for filename in job.files():
         data[filename] = job.load(filename)

There is also a job.json_load() function to directly load json content. Note that exax has no idea whether a file contains json, pickle, or something else. Make sure to use the proper functions.

The names of all registered files in a job are available using job.files(). This call will return a set of all filenames in the job. The absolute path of a particular file can be retrieved using the job.filename() function, like this

Find files created by a job.
 def main(urd):
     job = urd.build('my_script', ...)
     print(job.files())
     print(job.filename('myfile'))

Note

There is no need to use absolute paths with exax. Absolute paths should in fact be avoided, since they prevent moving things around in the file system later.

But it is nice to know that it is very easy to find any file generated in an exax project.

Tip

Files can also be listed and viewed in exax Board using a web browser.

Tip

The ax job shell command can also list and view files in a job.

8.11. Reading Input Files

Input data files should ideally be stored in the input directory specified in the configuration file. If so, input files could be addressed using a relative path, and therefore be moved around in the file system without causing any changes to the project code.

There are three helper functions for input data:

reading input files
# Returns the path to the input directory.
job.input_directory()

# Returns the full path to a specific file in the input directory.
# Multiple arguments will be fed to Python's os.path.join()
job.input_filename('thefile')
job.input_filename('or', 'a', 'path', 'to', 'thefile')

# Opens a file in the input directory for reading.
# (This is a wrapper around Python's open() function.)
fh = job.open_input('thefile', 'rb')
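Since multiple arguments are fed to os.path.join(), the resulting paths can be previewed without exax. In this sketch the input directory 'data/input' and the function input_filename_sketch are hypothetical; in a real project the directory comes from the configuration file.

```python
import os

# Hypothetical input directory; normally taken from the configuration file.
input_directory = os.path.join('data', 'input')


def input_filename_sketch(*parts):
    # Join the input directory with one or more path components.
    return os.path.join(input_directory, *parts)


single = input_filename_sketch('thefile')
nested = input_filename_sketch('or', 'a', 'path', 'to', 'thefile')
print(single)
print(nested)
```

Because the project code only mentions the relative components, moving the input directory requires changing the configuration and nothing else.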

Tip

Use the input_directory and corresponding helper functions to avoid having absolute paths in your project code!

8.12. Adding a Description

A text description is added to a job script using the description variable. This description is visible in exax Board and using the ax method command, and it looks like this

Example of description
description="""Collect movie ratings.

Movie ratings are collected using a parallel iteration
over all...
"""

Tip

Use ax method or exax Board to see descriptions of all available job scripts.

Descriptions work much like git commit messages. If the description spans several lines, the first line is a short description that will be shown when typing ax method to list all job scripts and their short descriptions. A detailed description may follow on consecutive lines, and it will be shown when doing ax method <a particular job script>.
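The split between short and detailed description can be illustrated in plain Python. This is only a sketch of the convention; how ax method actually parses the text is not shown here.

```python
description = """Collect movie ratings.

Movie ratings are collected using a parallel iteration
over all...
"""

# The first line is the short description; the rest is the detailed part.
short, _, detailed = description.partition('\n')
short = short.strip()
detailed = detailed.strip()

print(short)
```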

8.13. Retrieving stdout and stderr

Everything written to stdout and stderr (using for example plain print()-statements) is always stored persistently in the job directory. Use the job object’s output() function to access it

Show what the job printed to the terminal
def main(urd):
    job = urd.build('my_script')
    print(job.output())              # contains both stdout and stderr

It can also be retrieved using the ax job command, for example like this

ax job print stdout and stderr
 ax job test-43 -O

And it is straightforward to view the output in the Board web server as well.

8.13.1. Output retrieval in more detail

The combined stdout and stderr output is stored in the job directory like this

job-x/
  OUTPUT/
    prepare     # created if any output in prepare()
    synthesis   #                          synthesis()
    0           #                          analysis() slice 0
    3           #                          analysis() slice 3

Note that no empty files will be created.

It is possible to access any part of the output, like shown in the following examples

job.output()             # everything
job.output('prepare')
job.output('synthesis')
job.output(0)
job.output(3)

8.14. Progress/status reporting

If a job takes a long time to complete, pressing CTRL+T will force exax to print a message on stdout. This message can be tailored to the running program in the following way

custom status messages (shown when pressing CTRL+T)
from accelerator import status

def synthesis():
    msg = "my status message: %s"
    with status(msg % ('init',)) as update:
        for task in tasklist:
            update(msg % (task,))

In this example, the status message will update for each new task in the tasklist. The output message will automatically add execution time if it is running in prepare, analysis, or synthesis, and when in analysis it also provides information about which slice the message belongs to. It may for example look like this

589443 STATUS:      analysis(2) (9.0 seconds)
589443 STATUS:         my status message: task_number_one

or

589443 STATUS:      synthesis (14.1 seconds)
589443 STATUS:         my status message: the_synthesis_run
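The pattern of a context manager that hands out an update function can be sketched with contextlib. Everything below is a stand-in for illustration: status_sketch and current_status are hypothetical names, and the real accelerator.status also reports timing and slice information as shown above.

```python
from contextlib import contextmanager

current_status = []   # hypothetical: one entry per active status level


@contextmanager
def status_sketch(msg):
    # Register the message on entry and remove it again on exit.
    current_status.append(msg)
    index = len(current_status) - 1

    def update(new_msg):
        # Replace the message for this status level.
        current_status[index] = new_msg

    try:
        yield update
    finally:
        current_status.pop()


seen = []
msg = "my status message: %s"
with status_sketch(msg % ('init',)) as update:
    for task in ['task_number_one', 'task_number_two']:
        update(msg % (task,))
        seen.append(list(current_status))   # what a status request would show
```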

Tip

Exax Dataset iterators use status reporting to tell which Dataset in a Dataset chain they are currently working on.

8.15. Subjobs

Job scripts are typically built by build scripts, but in a similar way job scripts can be built by other job scripts. There is no difference from a built job’s perspective, but the nomenclature is that when a job script is building a job it is called a subjob.

Subjobs are built in the synthesis() function like this

Building a job from within a job.
from accelerator.subjobs import build

def synthesis():
    job = build('my_script')

The subjobs.build() call uses the same input parameters and syntax as the urd.build() call in a build script. Similarly, the returned job object is an instance of the Job class that contains some useful helper functionality.

Note

Subjobs are not visible in build scripts and do not show up in urd.joblist! Furthermore, they are not recorded in the urd database.

Subjobs are registered in the post-data of a job and can be retrieved by inspecting job.post.subjobs.

8.15.1. Subjobs and Datasets

Datasets created by subjobs can be made available to the job that built the subjob, to make it look like the dataset was created there. It works as shown in the following example

Link a subjob’s dataset to the current job.
from accelerator import subjobs

def synthesis():
    job = subjobs.build('create_a_dataset')
    ds = job.dataset(<name>)
    ds = ds.link_to_here(name=<anothername>)

In the example above, the job script create_a_dataset creates a dataset. A reference to this dataset is created using the job.dataset() function. Finally, using the ds.link_to_here() function, a soft link is created in the current job directory, pointing to the job directory of the subjob, completing the illusion that the dataset is created by the current job script.

Similarly, it is possible to override the dataset’s previous, like so

Override a subjob’s dataset’s previous
 ...
 ds = ds.link_to_here(name=<anothername>, override_previous=<some dataset>)

The ds.link_to_here() function returns a reference to the “new” linked dataset.