# Jobs

The API offers methods to retrieve the list of jobs and their status, so that they can be monitored. Additionally, new jobs can be created to build datasets.

## Reading the jobs’ status

The list of all jobs, finished or not, can be fetched with the list\_jobs() method. For example, to retrieve the jobs that failed after a given date:

§ import datetime

§ date = '2015/09/24'

§ # convert the date (local time) to a timestamp in milliseconds, the unit used by initiationTimestamp

§ date\_as\_timestamp = int(datetime.datetime.strptime(date, "%Y/%m/%d").timestamp() \* 1000)

§ project = client.get\_project('TEST\_PROJECT')

§ jobs = project.list\_jobs()

§ failed\_jobs = [job for job in jobs if job['state'] == 'FAILED' and job['def']['initiationTimestamp'] >= date\_as\_timestamp]

The list\_jobs() method returns one entry per job, each holding that job's information as a JSON object. Important fields are:

§ {

§     'def': {

§         'id': 'build\_cat\_train\_hdfs\_NP\_2015-09-28T09-17-37.455',  # the identifier for the job

§         'initiationTimestamp': 1443431857455,  # timestamp (in milliseconds) of when the job was submitted

§         'initiator': 'API (aa)',

§         'mailNotification': False,

§         'name': 'build\_cat\_train\_hdfs\_NP',

§         'outputs': [{

§             'targetDataset': 'cat\_train\_hdfs',  # the dataset(s) built by the job

§             'targetDatasetProjectKey': 'IMPALA',

§             'targetPartition': 'NP',

§             'type': 'DATASET'

§         }],

§         'projectKey': 'IMPALA',

§         'refreshHiveMetastore': False,

§         'refreshIntermediateMirrors': True,

§         'refreshTargetMirrors': True,

§         'triggeredFrom': 'API',

§         'type': 'NON\_RECURSIVE\_FORCED\_BUILD'

§     },

§     'endTime': 0,

§     'stableState': True,

§     'startTime': 0,

§     'state': 'ABORTED',  # the stable state of the job

§     'warningsCount': 0

§ }

The id field, found under def, is needed to get a handle on the job with get\_job() and then call abort() or get\_log() on it.
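As a minimal sketch of how the id field is used, the helper below gathers the logs of failed jobs. The name `collect_failed_logs` is hypothetical and not part of the API; it assumes only the list\_jobs(), get\_job() and get\_log() calls described on this page, and accepts any project handle exposing them.

```python
def collect_failed_logs(project, since_ms=0):
    """Return {job id: log text} for FAILED jobs submitted at or after since_ms.

    `project` is assumed to be a project handle exposing list_jobs() and
    get_job(). This helper is illustrative, not part of the API.
    """
    logs = {}
    for job in project.list_jobs():
        definition = job['def']
        if job['state'] != 'FAILED':
            continue
        if definition['initiationTimestamp'] < since_ms:
            continue
        # the id gives access to a job handle, which exposes get_log()
        logs[definition['id']] = project.get_job(definition['id']).get_log()
    return logs
```

With a real project handle, this could be called as `collect_failed_logs(client.get_project('TEST_PROJECT'), date_as_timestamp)`, where the timestamp is expressed in milliseconds.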

## Starting new jobs

A dataset is built by running a job that has it as an output. To create a job, build a job definition and start it. For a simple non-partitioned dataset, this is done with:

§ import time

§ project = client.get\_project('TEST\_PROJECT')

§ definition = {

§     "type" : "NON\_RECURSIVE\_FORCED\_BUILD",

§     "outputs" : [{

§         "id" : "dataset\_to\_build",

§         "type": "DATASET",

§         "partition" : "NP"

§     }]

§ }

§ job = project.start\_job(definition)

§ # poll the job state every second until it reaches a final state

§ state = ''

§ while state not in ('DONE', 'FAILED', 'ABORTED'):

§     time.sleep(1)

§     state = job.get\_status()['baseStatus']['state']

§ # done!

The example above uses start\_job() to start a job, and then checks the job state every second until it is complete. Alternatively, the method start\_and\_wait() can be used to start a job and return only after job completion.
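When polling by hand, it can help to bound the wait so that a stuck job does not block the caller forever. The sketch below is one possible way to do that; `wait_for_job`, `poll_interval` and `timeout` are hypothetical names, and the only API call assumed is get\_status() with the `baseStatus` / `state` layout used in the example above.

```python
import time

# final states, as used in the polling example on this page
TERMINAL_STATES = ('DONE', 'FAILED', 'ABORTED')

def wait_for_job(job, poll_interval=1.0, timeout=3600.0):
    """Poll job.get_status() until a final state is reached.

    Returns the final state, or raises TimeoutError if the job is still
    running after `timeout` seconds. Illustrative helper, not part of the API.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = job.get_status()['baseStatus']['state']
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError('job still in state %s after %s seconds' % (state, timeout))
        time.sleep(poll_interval)
```

A caller would pass the handle returned by start\_job(), for example `final_state = wait_for_job(job, timeout=600)`.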

The start\_job() method returns a job handle that can be used to later abort the job. Other jobs can be aborted once their id is known. For example, to abort all jobs currently being processed:

§ project = client.get\_project('TEST\_PROJECT')

§ for job in project.list\_jobs():

§     # a job that has not reached a stable state is still being processed

§     if not job['stableState']:

§         project.get\_job(job['def']['id']).abort()
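Aborting every running job can be too broad. A more selective variant, sketched here under the assumption that job entries carry the `stableState` and `def.initiationTimestamp` fields shown earlier, picks only the unfinished jobs submitted more than a given time ago. The function name and its parameters are hypothetical.

```python
import time

def unfinished_job_ids(jobs, min_age_ms=0, now_ms=None):
    """Return ids of jobs that are not in a stable state and were
    submitted at least min_age_ms milliseconds ago.

    `jobs` is a list of job entries as returned by list_jobs().
    Illustrative helper, not part of the API.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return [
        job['def']['id']
        for job in jobs
        if not job['stableState']
        and now_ms - job['def']['initiationTimestamp'] >= min_age_ms
    ]
```

The returned ids can then be passed to `project.get_job(job_id).abort()`, one at a time, as in the example above.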

The following example uses `DSSProject.new_job()` to build a managed folder, with the `with_output()` method as an alternative to writing the job definition as a dictionary:

§ project = client.get\_project('TEST\_PROJECT')

§ # where O2ue6CX3 is the managed folder id

§ job = project.new\_job('RECURSIVE\_FORCED\_BUILD').with\_output('O2ue6CX3', object\_type='MANAGED\_FOLDER')

§ res = job.start\_and\_wait()

§ print(res.get\_status())
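Since with\_output() returns the builder, several outputs can be added to the same job by chaining calls. The sketch below wraps that chaining in a small helper; `add_outputs` and the tuple layout of its `outputs` argument are assumptions made for illustration, while the with\_output() keyword arguments are those listed in the reference documentation.

```python
def add_outputs(builder, outputs):
    """Add several outputs to a job builder and return the builder.

    `outputs` is a list of (name, object_type, partition) tuples.
    Illustrative helper, not part of the API.
    """
    for name, object_type, partition in outputs:
        # with_output() returns the builder, so calls can be chained
        builder = builder.with_output(name, object_type=object_type, partition=partition)
    return builder
```

For example, `add_outputs(project.new_job('RECURSIVE_FORCED_BUILD'), [('dataset_to_build', 'DATASET', 'NP'), ('O2ue6CX3', 'MANAGED_FOLDER', None)])` would prepare a single job that builds both objects, ready for start\_and\_wait().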

## Reference documentation

*class* `dataikuapi.dss.project.JobDefinitionBuilder`(*project*, *job\_type='NON\_RECURSIVE\_FORCED\_BUILD'*)

Helper to run a job. Do not create this class directly; use `DSSProject.new_job()` instead.

`with_type`(*job\_type*)

Sets the build type

* Parameters: **job\_type** – the build type for the job: RECURSIVE\_BUILD, NON\_RECURSIVE\_FORCED\_BUILD, RECURSIVE\_FORCED\_BUILD, or RECURSIVE\_MISSING\_ONLY\_BUILD

`with_refresh_metastore`(*refresh\_metastore*)

Sets whether the Hive tables built by the job should have their definitions refreshed after the corresponding dataset is built

* Parameters: **refresh\_metastore** (*bool*) – whether to refresh the Hive table definitions

`with_output`(*name*, *object\_type=None*, *object\_project\_key=None*, *partition=None*)

Adds an item to build in this job

* Parameters:
  * **name** – name of the output object
  * **object\_type** – type of object to build: DATASET, MANAGED\_FOLDER, SAVED\_MODEL, or STREAMING\_ENDPOINT (defaults to **None**)
  * **object\_project\_key** – key of the project that contains the object to build (defaults to **None**)
  * **partition** – partition to build (defaults to **None**)

`get_definition`()

Gets the internal definition for this job

`start`()

Starts the job and returns a `dataikuapi.dss.job.DSSJob` handle to interact with it.

You need to wait for the returned job to complete.

* Returns: a job handle

* Return type: `dataikuapi.dss.job.DSSJob`

`start_and_wait`(*no\_fail=False*)

Starts the job, waits for it to complete, and returns a `dataikuapi.dss.job.DSSJob` handle to interact with it.

Raises an exception if the job failed.

* Parameters: **no\_fail** – if True, does not raise if the job failed (defaults to **False**).

* Returns: A job handle

* Return type: `dataikuapi.dss.job.DSSJob`
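When a failed build should be handled without an exception, `no_fail=True` can be combined with a status check on the returned handle. This is only a sketch: `run_without_raising` is a hypothetical name, and the `baseStatus` / `state` layout of the status is taken from the polling example earlier on this page.

```python
def run_without_raising(builder):
    """Run a job to completion and return its final state as a string.

    `builder` is assumed to be a job builder exposing start_and_wait().
    Illustrative helper, not part of the API.
    """
    # no_fail=True: do not raise if the job failed
    job = builder.start_and_wait(no_fail=True)
    return job.get_status()['baseStatus']['state']
```

The caller can then branch on the result, for instance treating any value other than 'DONE' as a failed build.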

*class* `dataikuapi.dss.job.DSSJob`(*client*, *project\_key*, *id*)

A job on the DSS instance

`abort`()

Aborts the job

`get_status`()

Get the current status of the job

* Returns: the state of the job, as a JSON object

`get_log`(*activity=None*)

Get the logs of the job

* Parameters: **activity** – (optional) the name of the activity in the job whose log is requested

* Returns: the log, as a string

*class* `dataikuapi.dss.job.DSSJobWaiter`(*job*)

Helper to wait for a job’s completion

`wait`(*no\_fail=False*)
