To deliver these insights and predictions in a production environment, follow these best practices:

 1. The volume of data to process will usually be significant, so a powerful compute engine is needed. The best practice is to store the input datasets in SQL databases such as Snowflake, Redshift, BigQuery, or Azure Synapse. This way, the scenario can automatically run all the operations directly in the SQL engine without loading the entire datasets into memory.
 
 2. Once all the steps of the Dataiku Application are configured and built, review the [default training parameters](analysis:guA99AD7) and adjust the algorithms, feature handling, and metrics to optimize performance for your specific use case. Once you find the most suitable model, deploy it to replace the existing one.
 
 3. Every time a prediction is made, it must be computed on the latest available batch of data so that it accurately supports the preventive maintenance process. To avoid rebuilding the entire flow over the full history, [partition](https://doc.dataiku.com/dss/latest/partitions/index.html) the input datasets and configure the downstream recipes to follow the same partitioning (at least the first pivot, join, and group by recipes, which take the most time). Then configure the [prediction scenario](scenario:PREDICT) to build the prediction only on the latest partitions of the data.
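
The partition selection described in the last practice could be sketched as follows. This is a minimal illustration in plain Python, assuming day-based partition identifiers in `YYYY-MM-DD` format. The function name `latest_partitions` and the sample identifiers are hypothetical; in a real project the identifiers would come from the partitioned input dataset rather than a hard-coded list.

```python
from datetime import datetime


def latest_partitions(partition_ids, n=1, fmt="%Y-%m-%d"):
    """Return the n most recent partition identifiers, oldest first.

    Assumes every identifier is a date string matching `fmt`
    (an assumption for this sketch, not a platform requirement).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Parse before sorting so the order follows calendar dates,
    # and drop duplicates so each partition is built only once.
    ordered = sorted(set(partition_ids), key=lambda p: datetime.strptime(p, fmt))
    return ordered[-n:]


if __name__ == "__main__":
    # Hypothetical identifiers, deliberately out of order.
    available = ["2024-01-01", "2024-01-03", "2024-01-02"]
    print(latest_partitions(available))        # ['2024-01-03']
    print(latest_partitions(available, n=2))   # ['2024-01-02', '2024-01-03']
```

Restricting the build to the returned identifiers keeps each prediction run proportional to the size of the newest batch rather than to the full history.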