# Custom Python functions in the Prepare Recipe[¶](https://knowledge.dataiku.com/latest/kb/data-prep/prepare-recipe/python.html#custom-python-functions-in-the-prepare-recipe "Permalink to this headline")

The Prepare recipe holds processors to handle the most common data preparation tasks. For more customized tasks, you can craft operations in the Formula language.

For even more complex tasks, or according to your preference, you can also use the Python function processor.

This processor allows you to perform complex row-wise operations on your dataset. The output of the function depends on which of the three modes is used:

| Mode | Returns |

| --- | --- |

| Cell | A new cell for each row |

| Row | A row for each row |

| Rows | A list of rows for each row |

## Cell mode[¶](https://knowledge.dataiku.com/latest/kb/data-prep/prepare-recipe/python.html#cell-mode "Permalink to this headline")

In cell mode, DSS expects a single column as output. This is useful for finding the largest value across columns, or the sum of a set of columns whose names follow some pattern.

In the example below, the `process()` function finds the longest string value across the columns in the dataset.

§ def process(row):

§ # In 'cell' mode, the process function must return

§ # a single cell value for each row

§ # The 'row' argument is a dictionary of columns of the row

§ max\_len = -1

§ longest\_str = None

§ for val in row.values():

§ if val is None:

§ continue

§ if len(val) > max\_len:

§ max\_len = len(val)

§ longest\_str = val

§ return longest\_str

## Row mode[¶](https://knowledge.dataiku.com/latest/kb/data-prep/prepare-recipe/python.html#row-mode "Permalink to this headline")

In row mode, DSS expects a row as output, which replaces the existing row. This is useful when you need to perform operations in place on the rows (as in the image below), or when you need to add multiple rows to the dataset (such as if you compute the mean and standard deviation of values in a row across columns).

In the example below, the `process()` function adds 5 to the integer-valued columns. Note that the `row` parameter passed to the `process()` function is a dictionary of the string values, so we use the `ast` package to evaluate whether they are integer-valued.

§ import ast

§ def process(row):

§ # In 'row' mode, the process function must return the full row.

§ # You may modify the 'row' in place to

§ # keep the previous values of the row.

§ for i in row.keys():

§ try:

§ isint = type(ast.literal\_eval(row[i])) is int

§ except:

§ isint = False

§ if isint:

§ row[i] = int(row[i]) + 5

§ return row

## Rows mode[¶](https://knowledge.dataiku.com/latest/kb/data-prep/prepare-recipe/python.html#rows-mode "Permalink to this headline")

In rows mode, DSS expects one or more rows as output, which replace the existing row. This is especially useful when you need to transform your data from wide to long format.

In the example below, the `process()` function splits each row into two. Where before a single row contained both a person’s work and home zip code and state, now each row contains either the home or work information, along with a new column that indicates whether it is the home or work information.

§ # Modify the process function to fit your needs

§ def process(row):

§ # In 'multi rows' mode, the process function

§ # must return an iterable list of rows.

§ ret = []

§ home = {"name": row["name"],

§ "type": "home",

§ "zip": row["home\_zip"],

§ "state": row["home\_state"]}

§ work = {"name": row["name"],

§ "type": "work",

§ "zip": row["work zipcode"],

§ "state": row["work state"]}

§ ret.append(home)

§ ret.append(work)

§ return ret
