# The Javascript API

* Fetching dataset data
* The DataFrame object
* Sampling
	+ sampling = ‘head’
	+ sampling = ‘random’
	+ sampling = ‘full’
	+ sampling = ‘random-column’
* Partitions selection
* Columns selection
* Rows filtering

The Dataiku Javascript API lets you write custom web apps that read from Dataiku datasets.

## Fetching dataset data

`dataiku.fetch(datasetName, [options, ]success, failure)`: Fetches the contents of a dataset and passes them to the `success` callback as a DataFrame object.

Arguments:

* **datasetName** (*string*) – Name of the dataset, in either of two formats:
	+ projectKey.datasetName
	+ datasetName. In this case, the default project is searched.
* **options** (*dict*) – Options for the call. Valid keys are:
	+ apiKey: forced API key for this dataset. By default, dataiku.apiKey is used.
	+ partitions: array of partition ids to fetch. By default, all partitions are fetched.
	+ sampling: sampling method to apply. By default, no sampling method is applied, but the default `limit` still caps the result at the first 20000 rows. See below for more details.
	+ columns: array of column names to return. See below for more details.
	+ filter: formula for filtering which rows are returned.
	+ limit: maximum number of rows to retrieve. **By default, this limit is set to 20000** (for safety reasons). See below for more details on sampling.
* **success** (*function(dataframe)*) – Called on success with a DataFrame object
* **failure** (*function(error)*) – Called on error

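As a rough illustration of the call shape described above, the following sketch fetches the head of a dataset and reports its row count. The function name `countRows`, its `client` parameter, and the `onDone` callback are illustrative assumptions, not part of the API. In a web app, `client` would be the global `dataiku` object.

```javascript
// Sketch only: `client` is expected to be the `dataiku` object.
// It is passed as a parameter so the function can also be exercised
// with a stand-in object outside of a web app.
function countRows(client, datasetName, onDone) {
  client.fetch(
    datasetName,
    { sampling: "head", limit: 1000 }, // the options argument is optional
    function (dataFrame) {
      // success callback: receives a DataFrame
      onDone(null, dataFrame.getNbRows());
    },
    function (error) {
      // failure callback: receives the error
      onDone(error, null);
    }
  );
}
```

In a web app this might be called as `countRows(dataiku, "MYPROJECT.mydataset", callback)`, where the dataset name is a placeholder.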

## The DataFrame object

*class* `DataFrame`()

Object representing a set of rows from a dataset.

DataFrame objects are created by dataiku.fetch.

Rows in a DataFrame can be accessed in either of two ways:

* As “record” objects, which map each column name to its value

* As “row” arrays, which contain one entry per column

Row arrays require a bit more code, including a call to getColumnIdx, but generally provide better performance.
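The two access styles can be compared with a small sketch that sums one column both ways. The function names are hypothetical, and `dataFrame` is assumed to be the object passed to the `success` callback of dataiku.fetch. Wrapping values in `Number(...)` is a precaution in case they arrive as strings, which is an assumption rather than documented behavior.

```javascript
// Record style: one object per row, keyed by column name.
function sumWithRecords(dataFrame, columnName) {
  var total = 0;
  for (var i = 0; i < dataFrame.getNbRows(); i++) {
    var record = dataFrame.getRecord(i);
    total += Number(record[columnName]); // assumption: value may be a string
  }
  return total;
}

// Row-array style: resolve the column index once, then index into each row.
function sumWithRows(dataFrame, columnName) {
  var idx = dataFrame.getColumnIdx(columnName);
  if (idx === -1) {
    // getColumnIdx returns -1 when the column name is not found
    throw new Error("Unknown column: " + columnName);
  }
  var total = 0;
  for (var i = 0; i < dataFrame.getNbRows(); i++) {
    total += Number(dataFrame.getRow(i)[idx]);
  }
  return total;
}
```

The row-array version does one name lookup in total instead of one per row, which is presumably where the performance difference comes from.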

* `DataFrame.getNbRows()`
	+ Returns: the number of rows in the dataframe

* `DataFrame.getRow(rowIdx)`
	+ Returns: an array representing the row with a given row idx

* `DataFrame.getColumnNames()`
	+ Returns: an array of column names

* `DataFrame.getRows()`
	+ Returns: an array of dataframe rows. Each element of the array is what getRow would return

* `DataFrame.getRecord(rowIdx)`
	+ Returns: a record object for the row with a given row idx. The keys of the object are the names of the columns

* `DataFrame.getColumnValues(name)`
	+ Arguments: **name** (*string*) – Name of the column
	+ Returns: an array containing all values of the column `name`

* `DataFrame.getColumnIdx(name)`: Returns the columnIdx of the column bearing the name `name`. This idx can be used to look up entries in the array returned by getRow. Returns -1 if the column name is not found.
	+ Arguments: **name** (*string*) – Name of the column
	+ Returns: the columnIdx of the column, or -1

* `DataFrame.mapRows(f)`: Applies a function to each row
	+ Arguments: **f** (*function(row)*) – Function to apply to each row array
	+ Returns: the array `[ f(row[0]), f(row[1]), … , f(row[N-1]) ]`

* `DataFrame.mapRecords(f)`: Applies a function to each record
	+ Arguments: **f** (*function(record)*) – Function to apply to each record object
	+ Returns: the array `[ f(record[0]), f(record[1]), … , f(record[N-1]) ]`
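To make the difference between the two mapping methods concrete, here is a minimal sketch that extracts one column with each. The helper names are hypothetical, and `dataFrame` is assumed to come from dataiku.fetch.

```javascript
// mapRecords: f receives a record object keyed by column name.
function columnViaRecords(dataFrame, columnName) {
  return dataFrame.mapRecords(function (record) {
    return record[columnName];
  });
}

// mapRows: f receives a row array, so the column index is resolved first.
function columnViaRows(dataFrame, columnName) {
  var idx = dataFrame.getColumnIdx(columnName);
  return dataFrame.mapRows(function (row) {
    return row[idx];
  });
}
```

Both helpers should yield one value per row, in row order. For this particular task getColumnValues is the more direct call; the mapping methods become useful when `f` computes something from several columns.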

* `dataiku.setAPIKey(apiKey)`: Sets the API key to use. This should generally be the first thing called.

* `dataiku.setDefaultProjectKey(projectKey)`: Sets the “search path” for projects. This is used to resolve dataset names given as “datasetName” instead of “projectKey.datasetName”.
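A short setup sketch, under the assumption that both setters are called once before any fetch. `configureClient` and its parameter names are hypothetical, and `client` stands for the `dataiku` object.

```javascript
// Sketch: configure the client before fetching anything.
function configureClient(client, apiKey, projectKey) {
  client.setAPIKey(apiKey);                // generally the first call
  client.setDefaultProjectKey(projectKey); // used to resolve bare dataset names
}
```

After `configureClient(dataiku, "YOUR_API_KEY", "MYPROJECT")` (both values are placeholders), a dataset name given as “datasetName” would be looked up in MYPROJECT.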

## Sampling

Returning a whole dataset as a JS object is generally not possible because of memory constraints. The API lets you sample the rows of the dataset through keys of the options object.

The **sampling** key contains the sampling method to use.

For more details on the sampling methods, see Sampling.

Note

The default sampling is **head(20000)**: by default, only the first 20K rows are returned.

### sampling = ‘head’

Returns the first rows of the dataset.

```javascript
/* Returns the first 15 000 lines */
{
    sampling : 'head',
    limit : 15000
}
```

### sampling = ‘random’

Returns either a number of rows, randomly picked, or a ratio of the dataset.

```javascript
/* Returns 10% of the dataset */
{
    sampling : 'random',
    ratio : 0.1
}

/* Returns 15000 rows, randomly sampled */
{
    sampling : 'random',
    limit : 15000
}
```

### sampling = ‘full’

No sampling is applied; all rows are returned.

### sampling = ‘random-column’

Returns a number of rows based on the values of a column.

```javascript
/* Returns 15000 rows, randomly sampled among the values of column 'user_id' */
{
    sampling : 'random-column',
    sampling_column : 'user_id',
    limit : 15000
}
```

## Partitions selection

In the `partitions` key, you can pass in a JS array of partition identifiers.

```javascript
/* Only returns data from two partitions */
{
    partitions : ["2014-01-02", "2014-02-04"]
}
```

## Columns selection

In the `columns` key, you can pass in a JS array of column names. Only these columns are returned.

```javascript
/* Only returns two columns from the dataset */
{
    columns : ["type", "price"]
}
```

## Rows filtering

In the `filter` key, you can pass a custom formula to filter the returned rows.

```javascript
/* Only returns rows matching condition */
{
    filter : "type == 'event' && price > 2000"
}
```
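The option keys shown in the sections above can be put together in one call. The sketch below assumes that these keys may be combined in a single options object, which the text does not state explicitly. `buildOptions` and `fetchFiltered` are hypothetical names, and `client` stands for the `dataiku` object.

```javascript
// Sketch: one options object reusing the example values from this page.
// Assumption: sampling, partitions, columns and filter can be used together.
function buildOptions() {
  return {
    sampling: "random",
    limit: 5000,
    partitions: ["2014-01-02", "2014-02-04"],
    columns: ["type", "price"],
    filter: "type == 'event' && price > 2000"
  };
}

// Passes the combined options as the second argument of fetch.
function fetchFiltered(client, datasetName, onSuccess, onFailure) {
  client.fetch(datasetName, buildOptions(), onSuccess, onFailure);
}
```

A call such as `fetchFiltered(dataiku, "mydataset", success, failure)` would then request at most 5000 randomly sampled rows. The order in which filtering and sampling are applied is not specified here.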
