# Data model requirements

A transaction history is **mandatory**, since analyzing transactions is at the core of the project.

**Data format requirement**: 1 row = the information about ***an item*** (item identifier column | [What is an item?](article:20)) ***purchased in a transaction*** (transaction identifier column) ***at a given date*** (date column).
Optionally, a row can also contain:
- ***The customer*** (customer identifier column) who made the transaction. This field becomes mandatory if you want to compute recommendations for your customers.
- ***Transaction context information***. [Learn more about the transaction context](article:30).
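
The format above can be checked with a few lines of code before the data is plugged in. The following is a minimal sketch and not part of the Market Basket Analysis itself: the function name `check_rows`, the column names passed to it, and the sample rows are all hypothetical, and the rows are assumed to be already loaded as dictionaries.

```python
from typing import Iterable, List, Mapping, Optional


def check_rows(
    rows: Iterable[Mapping[str, object]],
    item_col: str,
    transaction_col: str,
    date_col: str,
    customer_col: Optional[str] = None,
) -> List[str]:
    """Return a list of problems; an empty list means every row fits the format."""
    required = [item_col, transaction_col, date_col]
    # The customer column is only required when recommendations are needed.
    if customer_col is not None:
        required.append(customer_col)

    problems = []
    for index, row in enumerate(rows):
        for column in required:
            value = row.get(column)
            if value is None or str(value).strip() == "":
                problems.append(f"row {index}: missing value for '{column}'")
    return problems


# Hypothetical rows: the second one has an empty item identifier.
rows = [
    {"item": "A", "transaction": "T1", "date": "2010-12-01T08:26:00.000Z"},
    {"item": "", "transaction": "T1", "date": "2010-12-01T08:26:00.000Z"},
]
print(check_rows(rows, "item", "transaction", "date"))
```

Passing `customer_col` makes the customer identifier mandatory as well, which mirrors the recommendation use case described above.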

The dataset below illustrates the expected columns:
![transactions_dataset.png](BVEy7emz8szm)
- The item identifier column is ***Description***.
  - Here, ***Description*** contains a product name (so an item is a product). In the first row, the purchased product is "WHITE HANGING HEART T-LIGHT HOLDER".
- The transaction identifier column is ***InvoiceNo***.
  - Here, the first 7 rows all belong to the same transaction, which has the ID "536365".
- The transaction date column is ***InvoiceDate***.
  - Here, the first transaction took place on "2010-12-01T08:26:00.000Z".
  - :warning: Dates must be stored in [ISO-8601 or RFC 822](https://doc.dataiku.com/dss/latest/preparation/dates.html#meanings-and-types) format if you work with _Microsoft SQL Server_ ([See the compatible data storages](article:59)).

Optionally, the dataset also has:
- ***CustomerID*** as a customer identifier column.
- ***Country*** as a column giving transaction context information.
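
To make the "1 row = 1 item in 1 transaction" layout concrete, the sketch below groups rows into one set of items per transaction. It is only an illustration under stated assumptions: `group_into_baskets` is a hypothetical helper, and apart from the first row's `Description`, `InvoiceNo` and `InvoiceDate`, every value in the sample rows is made up.

```python
from collections import defaultdict

# Sample rows using the column names of the example dataset.
# Only the first row's Description, InvoiceNo and InvoiceDate come from the
# text above; all other values are invented for illustration.
rows = [
    {"InvoiceNo": "536365", "Description": "WHITE HANGING HEART T-LIGHT HOLDER",
     "InvoiceDate": "2010-12-01T08:26:00.000Z", "CustomerID": "C-001", "Country": "Country A"},
    {"InvoiceNo": "536365", "Description": "EXAMPLE ITEM B",
     "InvoiceDate": "2010-12-01T08:26:00.000Z", "CustomerID": "C-001", "Country": "Country A"},
    {"InvoiceNo": "999999", "Description": "EXAMPLE ITEM C",
     "InvoiceDate": "2010-12-01T09:00:00.000Z", "CustomerID": "C-002", "Country": "Country B"},
]


def group_into_baskets(rows, transaction_col="InvoiceNo", item_col="Description"):
    """Collect the distinct items of each transaction."""
    baskets = defaultdict(set)
    for row in rows:
        baskets[row[transaction_col]].add(row[item_col])
    return dict(baskets)


baskets = group_into_baskets(rows)
for invoice, items in baskets.items():
    print(invoice, sorted(items))
```

Rows sharing the same `InvoiceNo` end up in the same basket, which is why the transaction identifier column is mandatory.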
 
# Data storage requirements

The Market Basket Analysis is designed to work with SQL data storage.
To benefit natively from all the Dataiku application automation, you will need one of the following connections:
- Snowflake
- Google Cloud Platform:
    - BigQuery + GCS **(⚠️ Both are required if you want to leverage BigQuery)**
- Azure:
    - Azure Blob Storage
- PostgreSQL
- Microsoft SQL Server
    - :warning: Dates must be stored in [ISO-8601 or RFC 822](https://doc.dataiku.com/dss/latest/preparation/dates.html#meanings-and-types) format if you work with this storage.
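
The date requirement for Microsoft SQL Server can be checked before loading the data. The sketch below uses only the Python standard library and is not how Dataiku parses dates: `parse_date` is a hypothetical helper, and treating dates without a timezone as UTC is an assumption made here for simplicity.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_date(value: str) -> datetime:
    """Parse an ISO-8601 or RFC 822 date string into a timezone-aware datetime."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11,
    # so it is rewritten as an explicit UTC offset first.
    iso_value = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        parsed = datetime.fromisoformat(iso_value)
    except ValueError:
        try:
            # Fall back to the RFC 822 style, e.g. "Wed, 01 Dec 2010 08:26:00 GMT".
            parsed = parsedate_to_datetime(value)
        except (TypeError, ValueError) as exc:
            raise ValueError(f"not an ISO-8601 or RFC 822 date: {value!r}") from exc
    # Assumption: dates without timezone information are treated as UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


iso_date = parse_date("2010-12-01T08:26:00.000Z")
rfc_date = parse_date("Wed, 01 Dec 2010 08:26:00 GMT")
print(iso_date.isoformat(), iso_date == rfc_date)
```

Any value that raises `ValueError` here would need to be reformatted before being stored.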

Roll-out and customization services for other data storage connections can be offered on demand.