This project requires the following technical specifications:

# Instance Requirements

This solution is only compatible with instances running **Dataiku DSS V14+**.


## Code Environment
The project's Python recipes use the Python 3.9 code env **solution_product-recommendations**.


Required packages for this code env are:

- `numpy<1.27`
- `Flask==2.0.2`
- `scikit-learn>=1.0,<1.6`
- `pillow==11.3.0`
- `scikit-image==0.24.0`
- `opencv-python-headless==4.11.0.86`
- `imageio==2.37.0`
- `Werkzeug==2.3.7`
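
To confirm that a code env matches the exact pins listed above, a check along the following lines could be run inside it. This is a minimal sketch, not part of the solution: the helper name `find_pin_mismatches` is hypothetical, and the range constraints (`numpy`, `scikit-learn`) are deliberately left out because comparing them needs a version parser.

```python
from importlib import metadata

# Exact (==) pins copied from the package list above.
# Range constraints (numpy, scikit-learn) are not checked here.
EXACT_PINS = {
    "Flask": "2.0.2",
    "pillow": "11.3.0",
    "scikit-image": "0.24.0",
    "opencv-python-headless": "4.11.0.86",
    "imageio": "2.37.0",
    "Werkzeug": "2.3.7",
}


def find_pin_mismatches(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed.
    `get_version` is injectable so the check can be tested without installing anything.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # Print one line per package whose installed version differs from its pin.
    for name, (expected, found) in find_pin_mismatches(EXACT_PINS).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means every exact pin is satisfied in the environment where the script runs.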

## Mandatory plugins

You will need to install the following plugins on your DSS instance:
- [Dataiku's Recommendation System Plugin](https://www.dataiku.com/product/plugins/recommendation-system/).
- [Dataiku's Sankey charts Plugin](https://www.dataiku.com/product/plugins/flow-charts/).

### Installing a DSS plugin
[Learn how to install a DSS plugin](https://doc.dataiku.com/dss/latest/plugins/installing.html).


## Datasources requirements
Some data sources are mandatory to use the solution, while others are optional but can improve the performance of the ***Product Recommendations*** machine learning model.

### :arrow_forward: **<span style="color:#222222">Mandatory Datasources</span>**

#### [interactions_history](dataset:interactions_history)
This dataset is the core of the industry solution. It contains records of all historical interactions between users and items.
   
**Data format requirement**: 1 row =  The information of:
- Mandatory columns:
    - ***A customer***  (User ID).
    - ***That interacted with a product or service***  (Item ID).
    - ***At a given date*** (Date column).
   
- Optional columns:
    - ***The revenue*** generated by the customer interaction with the product or service (revenue identifier column).
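
The format requirement above can be checked before loading data into the project. The sketch below is only an illustration: the column names `user_id`, `item_id` and `date`, the `%Y-%m-%d` date format, and the helper name `find_invalid_rows` are all assumptions, since the solution lets you map your own column names.

```python
from datetime import datetime

# Hypothetical column names: replace them with the ones used in your dataset.
REQUIRED_COLUMNS = ("user_id", "item_id", "date")


def find_invalid_rows(rows, date_format="%Y-%m-%d"):
    """Return (row_index, reason) pairs for rows that break the format requirement.

    `rows` is an iterable of dicts whose values are strings, for example
    the output of csv.DictReader. The date format is an assumption.
    """
    problems = []
    for index, row in enumerate(rows):
        # A mandatory column is missing when it is absent, None, or empty.
        missing = [col for col in REQUIRED_COLUMNS if row.get(col) in (None, "")]
        if missing:
            problems.append((index, "missing " + ", ".join(missing)))
            continue
        try:
            datetime.strptime(row["date"], date_format)
        except ValueError:
            problems.append((index, "unparseable date"))
    return problems
```

For example, a row with an empty item ID is reported as `"missing item_id"`, and a row dated `06/01/2024` is reported as `"unparseable date"` under the assumed format.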
   
### :arrow_forward: **<span style="color:#e0c587">Optional Datasources</span>**

#### [user_metadata](dataset:user_metadata)
This is a dataset providing information about all your customers.

**Data format requirement**: 1 row =  The information of:
- Mandatory columns:
    - ***A customer identifier***  (Customer ID).
    
- Optional columns:
    - ***Customer age column*** if you want to use the application's [ages_clustering](article:53) and [age_feature_engineering](article:57) capabilities.
    - ***Any other columns describing your customer*** if you want to enrich the machine learning model with this information.


#### [item_metadata](dataset:item_metadata)
This is a dataset providing information about all your products.

**Data format requirement**: 1 row =  The information of:
- Mandatory columns:
    - ***A product or service identifier***  (Item ID).
    
- Optional columns:
    - ***Any categorical columns describing your products*** if you want to leverage [product characteristics collaborative filtering](article:56).
    - ***Any columns of other data types describing your products*** if you want to enrich the machine learning model with this information.
    - ***The name of the '.jpg' picture file***  (Picture name) if you want to leverage pictures in the [project's webapp](article:22).

#### [item_pictures](managed_folder:oqdDCckT)
This folder contains all your product pictures. Using it makes it easier to audit the model's behavior in the [project webapp](article:22).

⚠️ ***Mandatory format: '.jpg'***
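
Before uploading pictures, it can help to confirm that every picture name referenced in `item_metadata` is a `.jpg` file that actually exists in the folder. The following is a minimal sketch under stated assumptions: the helper name `find_picture_problems` is hypothetical, file names are assumed to be plain names without a directory prefix, and the extension check is assumed to be case-sensitive.

```python
from pathlib import PurePosixPath


def find_picture_problems(picture_names, folder_files):
    """Return (name, reason) pairs for picture names that cannot be used.

    picture_names: values of the picture-name column in item_metadata.
    folder_files: file names present in the pictures folder.
    """
    available = set(folder_files)
    problems = []
    for name in picture_names:
        # Assumption: only the lowercase '.jpg' extension is accepted.
        if PurePosixPath(name).suffix != ".jpg":
            problems.append((name, "not a .jpg file"))
        elif name not in available:
            problems.append((name, "missing from folder"))
    return problems
```

An empty list means every referenced picture has the expected extension and a matching file in the folder.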


## Data storage requirements
The ***Product Recommendations*** industry solution requires a SQL database, because the recipes of the [Recommendation System Plugin](https://www.dataiku.com/product/plugins/recommendation-system/) depend on a SQL engine to run memory-intensive processes.

This solution is compatible with the following connections:
- PostgreSQL
- Snowflake

Other connections compatible with the plugin could be integrated, including *Google BigQuery, Microsoft SQL Server and Azure Synapse*. Roll-out and customization services can be offered on demand for any of these data storage connections.
