# Execution engines

* Design of the preparation

* Execution in analysis

* Execution of the recipe

  + DSS

  + Spark

  + In-database (SQL)

* Details on the in-database (SQL) engine

  + Supported processors

  + Partially supported processors

* Details on the Spark engine

## Design of the preparation

The design of a data preparation is always done on an in-memory sample of the data. See Sampling for more information.

## Execution in analysis

In an analysis, the preparation is executed on the whole dataset when:

* Exporting the prepared data

* Running a machine learning model

In both cases, this uses a streaming engine: all data goes through the DSS server but does not need to be in memory.

## Execution of the recipe

For execution of the recipe, DSS provides three execution engines:

### DSS

All data goes through the DSS server but does not need to be in memory, because it is streamed.
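The streaming behaviour described above can be pictured with a minimal Python sketch. This is not DSS code: the function names (`read_rows`, `apply_steps`, `stream_prepare`) and the two example steps are hypothetical, chosen only to show how rows can pass through a server one at a time without the whole dataset being loaded in memory.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List, Optional, TextIO

Row = Dict[str, str]
# A step returns the transformed row, or None to drop the row.
Step = Callable[[Row], Optional[Row]]


def read_rows(source: Iterable[str]) -> Iterator[Row]:
    """Yield rows one at a time from a CSV text stream."""
    yield from csv.DictReader(source)


def apply_steps(rows: Iterable[Row], steps: List[Step]) -> Iterator[Row]:
    """Apply every step to each row lazily; only one row is held at a time."""
    for row in rows:
        current: Optional[Row] = row
        for step in steps:
            current = step(current)
            if current is None:  # the step removed this row
                break
        if current is not None:
            yield current


def rename_column(old: str, new: str) -> Step:
    """Hypothetical step: rename a column."""
    def step(row: Row) -> Optional[Row]:
        row = dict(row)
        row[new] = row.pop(old)
        return row
    return step


def remove_rows_with_empty(column: str) -> Step:
    """Hypothetical step: drop rows whose value in `column` is empty."""
    def step(row: Row) -> Optional[Row]:
        return row if row.get(column) else None
    return step


def stream_prepare(source: Iterable[str], sink: TextIO, steps: List[Step]) -> int:
    """Read, transform and write row by row; return the number of rows written."""
    writer = None
    written = 0
    for row in apply_steps(read_rows(source), steps):
        if writer is None:
            # The header is taken from the first output row.
            writer = csv.DictWriter(sink, fieldnames=list(row))
            writer.writeheader()
        writer.writerow(row)
        written += 1
    return written


source = io.StringIO("id,name\n1,alice\n2,\n3,carol\n")
sink = io.StringIO()
steps = [remove_rows_with_empty("name"), rename_column("name", "customer")]
rows_written = stream_prepare(source, sink, steps)
```

Because every stage is a generator, memory use depends on the size of a single row rather than on the size of the dataset, which is the property the DSS engine description relies on.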

### Spark

When Spark is installed (see: DSS and Spark), preparation recipe jobs can run on Spark.

We recommend using this engine only with the following dataset types, which support fast reads and writes on Spark:

* S3

* Azure Blob Storage

* Google Cloud Storage

* Snowflake

* HDFS

### In-database (SQL)

A subset of the preparation processors can be translated to SQL queries. When all processors in a preparation recipe can be translated, and both input and output are tables in the same SQL connection, the recipe runs fully in-database.

Please see the warnings and limitations below.

## Details on the in-database (SQL) engine

Only a subset of processors can be translated to SQL queries. They are documented in the processors reference. The SQL engine can only be selected if all processors are compatible with it.

If you add an unsupported processor while the in-database engine is selected, DSS shows which processor cannot be used, with details.
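The selection rule above (every processor must be translatable, and input and output must be in the same SQL connection) can be sketched as a small check. This is an illustration only, not the DSS implementation or API: `check_sql_engine`, `EngineCheck`, the connection names and the processor name `"Some other processor"` are hypothetical, the set below is just a subset of the processors this page lists as supported, and the sketch assumes both datasets are already SQL tables. Partially supported processors are not modelled.

```python
from dataclasses import dataclass
from typing import List

# Subset of the processors listed as supported on this page (illustrative).
SQL_SUPPORTED = frozenset({
    "Keep/Delete columns",
    "Reorder columns",
    "Rename columns",
    "Split columns",
    "Filter by numerical range",
    "Remove rows with empty value",
    "Fill empty cells with value",
    "Concatenate columns",
    "Copy columns",
})


@dataclass
class EngineCheck:
    sql_allowed: bool
    reasons: List[str]  # why the SQL engine cannot be selected, if any


def check_sql_engine(steps: List[str], input_connection: str,
                     output_connection: str) -> EngineCheck:
    """Return whether the in-database engine could be selected, with reasons."""
    reasons = [
        f"processor cannot be translated to SQL: {name}"
        for name in steps
        if name not in SQL_SUPPORTED
    ]
    if input_connection != output_connection:
        reasons.append("input and output are not in the same SQL connection")
    return EngineCheck(sql_allowed=not reasons, reasons=reasons)


ok = check_sql_engine(["Rename columns", "Copy columns"], "pg_main", "pg_main")
blocked = check_sql_engine(
    ["Rename columns", "Some other processor"], "pg_main", "pg_other"
)
```

Returning the list of reasons rather than a bare boolean mirrors the way DSS reports which processor blocks the engine.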

Note

There are some edge cases of columns that change type where DSS may show the engine as supported, but upon running the recipe, you encounter a syntax error. If that happens, you will need to disable the SQL engine and fall back to the DSS engine.

Some of these edge cases relate to type conflicts. For example, a find/replace operation turns a textual column into a numerical one, and the column is then immediately used in numerical operations.
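As a loose analogy in plain Python (not DSS code, and not the SQL that DSS generates), the kind of type conflict described in this note looks like the following. The helper names `find_replace` and `add_tax` and the sample data are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, str]


def find_replace(rows: List[Row], column: str, old: str, new: str) -> List[Row]:
    """Text find/replace: values remain strings even if they now look numeric."""
    return [{**row, column: row[column].replace(old, new)} for row in rows]


def add_tax(rows: List[dict], column: str, rate: float) -> List[dict]:
    """Numerical operation that expects real numbers in `column`."""
    return [{**row, column: row[column] * (1 + rate)} for row in rows]


rows = [{"price": "$10"}, {"price": "$25"}]
cleaned = find_replace(rows, "price", "$", "")  # values look numeric, but are text

try:
    add_tax(cleaned, "price", 0.5)
    conflict = False
except TypeError:
    # Multiplying a string by a float is rejected, much as a strictly typed
    # database can reject arithmetic on a column it still stores as text.
    conflict = True
```

The values look numerical after the replacement, but their underlying type has not changed, so the numerical step fails at run time.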

When using Snowflake, additional processors are supported thanks to unique extended push-down capabilities. Please see Snowflake for more details. Additional setup is required to benefit from extended push-down.

### Supported processors

These processors are available with SQL processing:

* Keep/Delete columns

* Reorder columns

* Rename columns

* Split columns

* Filter by alphanumerical value

* Filter by numerical range

* Flag by alphanumerical value

* Flag by numerical range

* Remove rows with empty value

* Fill empty cells with value

* Concatenate columns

* Copy columns

* Unfold

* Split and unfold

* (Snowflake only) Discretize (bin) numerical values

* (Snowflake only) Convert currencies

* (Snowflake only) Currency Splitter

* (Snowflake only) Split e-mail addresses

* (Snowflake only) Extract numbers

* (Snowflake only) Resolve GeoIP

* (Snowflake only) Flag holidays

* (Snowflake only) Normalize measure

* (Snowflake only) Split HTTP Query String

* (Snowflake only) Simplify text

* (Snowflake only) Convert a UNIX timestamp to a date

* (Snowflake only) Split URL (into protocol, host, port, …)

* (Snowflake only) Classify User-Agent

* (Snowflake only) Generate a best-effort visitor id

### Partially supported processors

With some configurations, these processors revert to normal processing. Various issues may also appear and require you to switch back to the DSS engine.

* Formula (essentially the same support as in other visual recipes)

* Filter by formula (see above)

* Flag by formula (see above)

* Find / Replace (especially around regular expressions)

* Transform string (depends on the transformation). All transformations are available with Snowflake

* Extract with regular expression. More options are available with Snowflake

* Date-handling processors (parse date, extract date components)

* Geo processors (extract from geo column, change coordinate reference system (CRS))

* (Snowflake only) Filter invalid rows/cells and Flag invalid rows (not supported for custom meanings)

## Details on the Spark engine

All processors are compatible with the Spark engine.
