# Sampling datasets

* Filtering in DSS
  + Rules based filters
    - Conditions
    - Groups
    - Boolean operators
  + Formula based filters
  + SQL expression based filters
  + ElasticSearch query string

The “sample/filter” recipe serves two purposes: sampling a dataset, filtering it, or both.

## Filtering in DSS

Four types of filter are available. Select one using the dropdown menu at the top:

* Rules based

* Formula based

* SQL expression based

* ElasticSearch query string (only available when the input dataset is on ElasticSearch v7 and above)

### Rules based filters

A filter is defined by a list of possibly grouped conditions and the boolean operators that bind them.

#### Conditions

A condition is defined by an input column, an operator, and a value.

* Input column: choose any column from the dataset.

* Operator: choose an operator from the dropdown menu. The available operators depend on the storage type of the column: a string column offers string operators such as `contains`, while a numeric column offers numerical operators such as `<`.

* Value: enter a value, or choose an existing column, to compare the input column against.

Conditions can be added, deleted, duplicated, and turned into a group to create advanced conditions.
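As an illustration only, the column/operator/value structure of a condition can be sketched in Python. The `Condition` class and the `OPERATORS` table below are hypothetical names, not part of any DSS API. The sketch covers just two operators and a literal value (not a second column).

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Hypothetical operator table: DSS offers more operators than these two.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "contains": lambda cell, value: value in cell,  # string operator
    "<": lambda cell, value: cell < value,          # numerical operator
}


@dataclass
class Condition:
    column: str    # input column name
    operator: str  # key into OPERATORS
    value: Any     # literal value to compare the column against

    def matches(self, row: Dict[str, Any]) -> bool:
        """Return True if the row satisfies this condition."""
        return OPERATORS[self.operator](row[self.column], self.value)


# Made-up sample rows.
rows = [
    {"name": "alice smith", "age": 34},
    {"name": "bob jones", "age": 27},
]

cond = Condition(column="age", operator="<", value=30)
kept = [row for row in rows if cond.matches(row)]
```

Here `kept` holds only the rows whose `age` is below 30. A string column would use `contains` in the same way.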

#### Groups

Groups express more advanced conditional logic. They can be nested to create sub-conditions, as in `(y AND z AND (t OR u))`, or defined at the same level, as in `(y OR z) AND (t OR u)`. Groups can be added with the +ADD > Add group button, and can be deleted, duplicated, or ungrouped.

#### Boolean operators

Conditions and groups are bound together using boolean operators, which can be either `And` or `Or`.
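To make the difference between nested and same-level groups concrete, here is a small Python sketch. The helpers `all_of` and `any_of` stand in for `And` and `Or`. The conditions `y`, `z`, `t`, `u` and the sample row are made up for illustration. None of this is DSS code.

```python
from typing import Any, Callable, Dict

Row = Dict[str, Any]
Predicate = Callable[[Row], bool]


def all_of(*preds: Predicate) -> Predicate:
    """Group whose members are joined with And."""
    return lambda row: all(p(row) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Group whose members are joined with Or."""
    return lambda row: any(p(row) for p in preds)


# Hypothetical conditions on made-up columns.
y: Predicate = lambda row: row["country"] == "FR"
z: Predicate = lambda row: row["age"] >= 18
t: Predicate = lambda row: row["plan"] == "pro"
u: Predicate = lambda row: row["trial"]

# Nested group: (y AND z AND (t OR u))
nested = all_of(y, z, any_of(t, u))

# Two groups at the same level: (y OR z) AND (t OR u)
same_level = all_of(any_of(y, z), any_of(t, u))

row = {"country": "DE", "age": 40, "plan": "free", "trial": True}
nested_result = nested(row)
same_level_result = same_level(row)
```

For this row, `y` is false, so the nested filter rejects it. The same-level filter keeps it because `z` and `u` are both true.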

### Formula based filters

Formulas are written by hand using functions of the formula language, dataset column names, and project variables. They are well suited to more complex filters, or to specific functions that are not available in the rules based filter view. The formula language is described in its own documentation.

### SQL expression based filters

When using an SQL based recipe engine, an SQL expression can be given directly to filter the dataset. The expression can reference dataset columns and project variables.
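The following Python sketch shows what such an expression might look like. It assumes the expression behaves like the predicate of a `WHERE` clause. An in-memory SQLite database stands in for the real SQL engine. The table, columns, and data are hypothetical, and project variables are not shown.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the dataset's SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, country TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("alice", "FR", 34), ("bob", "US", 27), ("carol", "FR", 19)],
)

# The kind of boolean expression one would type into the filter.
filter_expression = "country = 'FR' AND age >= 21"

# Assumption: the expression is applied as a WHERE predicate.
query = f"SELECT name FROM customers WHERE {filter_expression} ORDER BY name"
kept = [name for (name,) in conn.execute(query)]
conn.close()
```

Only the rows for which the expression evaluates to true are kept. The SQL dialect accepted in practice is that of the database hosting the dataset.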

### ElasticSearch query string

When using an input dataset on ElasticSearch v7 and above, you can use the `query_string` syntax to filter the dataset.
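For reference, a query string and the Elasticsearch request body that would carry it can be sketched in Python. The field names `country` and `age` are hypothetical. The surrounding body shows how Elasticsearch itself wraps a `query_string` query, not what DSS necessarily sends. In the recipe you would presumably enter only the string.

```python
import json

# Hypothetical filter: documents from France with age of at least 21.
query_string = "country:FR AND age:>=21"

# Standard Elasticsearch query DSL wrapper for a query_string query.
body = {
    "query": {
        "query_string": {
            "query": query_string,
        }
    }
}

print(json.dumps(body, indent=2))
```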

Note

When using an ElasticSearch query string, sampling is disabled and filtering is performed on the whole dataset.
