# Financial Forecasting

Contents

* Overview

  + Business Case

  + Technical Requirements

  + Installation

  + Data Requirements

  + Workflow Overview

* Walkthrough

  + Tailor the Project to Our Own Needs

  + Cleaning and Preparing our Historical Financial Data

  + Identifying Trends in Historical Data

  + Simple and Advanced Forecasting

  + Visualize Approach Comparisons and Drivers

  + Responsible AI Considerations

  + Reproducing these Processes With Minimal Effort For Your Own Data

## Overview

### Business Case

The financial forecasting processes managed by Finance teams play a central role in helping companies make appropriate cost management and investment decisions. Yet 40 percent of CFOs feel their forecasts are not accurate and that the process takes too much time. More precise forecasts that cost less to produce are of immediate value, but connecting to the data and tapping into the different techniques needed to achieve them can feel out of reach. This can result from having too little time to dedicate to setting up a new forecasting project, or from a lack of confidence in the statistical and machine learning techniques involved.

Enhancing the efficiency of Financial Forecasting requires:

* Improving the capacity for Finance teams to quickly access data and automate data pipelines. Rather than relying on manual checks and merges via spreadsheets, teams can streamline their processes, save time and reduce errors, allowing them to focus more on analysis and decision making.

* Making it easier to compare traditional and advanced statistical or machine-learning forecasting techniques, with simple tests and selection of appropriate drivers, in order to develop more accurate projections while retaining full ownership and explainability.

Dataiku’s Financial Forecasting Solution offers Finance teams an opportunity to confidently transition through a transformative shift in their business impact while retaining full control of the process and its outputs.

### Technical Requirements

To leverage this solution, you must meet the following requirements:

* Have access to a DSS 11.0+ instance.

* A Python 3.7+ code environment named `solution_financial-forecasting` with the following required packages:

  + `pmdarima==2.0.2`

  + `numpy==1.21.6`

Note

When creating a new code environment, please be sure to use the name `solution_financial-forecasting` or remapping will be required.

### Installation

If the technical requirements are met, this solution can be installed in one of two ways:

* On your Dataiku instance click **+ New Project** > **Business solutions** > Search for **Financial Forecasting**.

* Download the .zip project file and upload it directly to your Dataiku instance as a new project.

### Data Requirements

Two input datasets are required for the solution:

* *historical\_ts\_data*, which contains historical time series data for the financial variable to forecast, its drivers, and the manual forecast:

  + date: sequence of dates taken at successive, equally spaced points in time

  + actual\_value: historical value of the quantitative variable the user wants to forecast

  + category: subcategories into which the target value is split

  + manual\_forecast: forecast figures computed manually by the user

  + (optional) driver\_n\_name: company-specific or macroeconomic information chosen by the user

* *to\_forecast\_data*, which contains data for the period we want to forecast, including the drivers’ expected values and the manual forecast:

  + date: sequence of dates taken at successive, equally spaced points in time

  + category: subcategories into which the target value is split

  + manual\_forecast: forecast figures computed manually by the user

  + (optional) driver\_n\_name: company-specific or macroeconomic information chosen by the user

The two input datasets must have the same time frequency.
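As a rough illustration of the data model above, the following plain-Python sketch checks that each dataset carries the required columns and that both share the same time frequency. The helper names (`missing_columns`, `frequency_label`), the sample rows, the driver column name, and the gap-to-label heuristic are all assumptions made for this example; they are not part of the Solution.

```python
from datetime import date
from statistics import median

# Required columns, taken from the data model described above.
REQUIRED_HISTORICAL = {"date", "actual_value", "category", "manual_forecast"}
REQUIRED_TO_FORECAST = {"date", "category", "manual_forecast"}


def missing_columns(rows, required):
    """Return the required column names that are absent from the rows."""
    present = set(rows[0]) if rows else set()
    return sorted(required - present)


def frequency_label(iso_dates):
    """Classify the typical gap between consecutive dates (coarse heuristic)."""
    ordered = sorted({date.fromisoformat(d) for d in iso_dates})
    if len(ordered) < 2:
        return "unknown"
    gap = median([(b - a).days for a, b in zip(ordered, ordered[1:])])
    if gap == 1:
        return "daily"
    if gap == 7:
        return "weekly"
    if 28 <= gap <= 31:
        return "monthly"
    if 89 <= gap <= 92:
        return "quarterly"
    if 365 <= gap <= 366:
        return "yearly"
    return f"{gap}-day"


# Hypothetical sample rows; "driver_1_gdp" is an invented optional driver.
historical = [
    {"date": "2023-01-01", "category": "revenue", "actual_value": 100.0,
     "manual_forecast": 95.0, "driver_1_gdp": 1.2},
    {"date": "2023-02-01", "category": "revenue", "actual_value": 110.0,
     "manual_forecast": 105.0, "driver_1_gdp": 1.3},
    {"date": "2023-03-01", "category": "revenue", "actual_value": 120.0,
     "manual_forecast": 118.0, "driver_1_gdp": 1.1},
]
to_forecast = [
    {"date": "2023-04-01", "category": "revenue", "manual_forecast": 125.0,
     "driver_1_gdp": 1.2},
    {"date": "2023-05-01", "category": "revenue", "manual_forecast": 130.0,
     "driver_1_gdp": 1.4},
]

hist_freq = frequency_label([r["date"] for r in historical])
fcst_freq = frequency_label([r["date"] for r in to_forecast])
same_frequency = hist_freq == fcst_freq
```

A check like this is cheap to run before uploading data, since a missing column or a mismatch between, say, monthly history and quarterly forecast periods would otherwise only surface once the flow is rebuilt.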

### Workflow Overview

You can follow along with the solution in the Dataiku gallery.

The project has the following high-level steps:

1. Configure the project with our data and adjustments via the Dataiku Application.

2. Clean and prepare our financial data.

3. Explore our historical data to identify trends and seasonality.

4. Create two distinct forecast options and cross-evaluate performance.

5. Visualize Financial Forecasts to compare approaches and analyze the relationship between drivers and value.

## Walkthrough

Note

In addition to reading this document, it is recommended to read the wiki of the project before beginning to get a deeper technical understanding of how this solution was created, the different types of data enrichment available, longer explanations of solution-specific vocabulary, and suggested future direction for the solution.

### Tailor the Project to Our Own Needs

To begin, you will need to create a new instance of the Financial Forecasting Application. This can be done by selecting the Dataiku Application from your instance home and clicking **Create App Instance**. The project is delivered with sample data that should be replaced with our own data, provided it follows the data model described above. This can be done in one of two ways:

* Data can be uploaded directly from our filesystem in the first section of the Dataiku app.

* Data can be read from your database of choice by selecting an existing connection.

With either option, after loading the data, be sure to **refresh the page so that the app can dynamically take your data into account.**

With our data selected and loaded into the flow, we can move to the final App section, **Forecast financial data**. Here we can select the number of lag values and the drivers from our historical data to include in the advanced forecasting model. Clicking **RUN** will initiate a series of scenarios that rebuild the entire flow and update the Dashboard. If you’re only interested in the Dashboard, you can skip the following sections, where we’ll dive into the underlying Flow that supports the Dataiku App.

### Cleaning and Preparing our Historical Financial Data

In total, ten flow zones are involved in Data Prep and cleaning for this Solution. We won’t go into heavy detail about each flow zone as this information can be found in the wiki of the project. However, at a high level, the first seven flow zones found towards the start of our flow perform data normalization and lag values creation from our originating input datasets.

To deal with potential differences in the magnitude of values within each provided category, the Solution applies min-max normalization to the target variable. This linear transformation rescales the original time series into a given range (e.g., 0 to 1 or -1 to 1). Since we are working with time series data, lag value creation is also an important part of data prep, as it relates historical data to a given reference point. In this Solution, the number of lag values is the number of previous periods used by the advanced forecast method to predict the value of the next horizons.
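To make these two steps concrete, here is a minimal plain-Python sketch of min-max scaling to the 0 to 1 range and of lag value creation. It assumes scaling is applied separately per category, and the function names and sample figures are invented for illustration; the Solution itself performs these steps in its flow zones.

```python
def min_max_scale(values):
    """Scale values linearly to [0, 1]; also return the min and max used."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no spread; map everything to 0.0.
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def add_lags(values, n_lags):
    """Return one row per period: [value, lag_1, ..., lag_n].

    Lags that would reach before the start of the series are None.
    """
    rows = []
    for t, value in enumerate(values):
        lags = [values[t - k] if t - k >= 0 else None
                for k in range(1, n_lags + 1)]
        rows.append([value] + lags)
    return rows


# Hypothetical actual values for two categories of very different magnitude.
actuals = {
    "revenue": [100.0, 150.0, 200.0, 300.0],
    "travel_costs": [1.0, 2.0, 3.0, 5.0],
}

scaled = {}
bounds = {}
for category, series in actuals.items():
    scaled[category], lo, hi = min_max_scale(series)
    bounds[category] = (lo, hi)  # kept so values can be converted back later

revenue_with_lags = add_lags(scaled["revenue"], n_lags=2)
```

After scaling, both categories live on the same 0 to 1 range, and each row of `revenue_with_lags` pairs the current value with the two preceding periods.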

After our data has been used to train, validate, and score our forecast models, three final Data Prep flow zones are called upon to flag the date of the last actual value, convert normalized values back into actual values, and prepare our data for metric calculation. These three flow zones are important to get our data into a format that can be used to generate visualizations and metrics for the Dashboard.
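The first two of these post-processing steps can be sketched in a few lines of plain Python. This assumes the min and max used for scaling were stored per category, that dates are ISO-formatted strings, and that periods without an actual value hold `None`; the function names and figures are hypothetical.

```python
def min_max_inverse(scaled_values, lo, hi):
    """Convert values from the [0, 1] scale back to the original scale."""
    return [s * (hi - lo) + lo for s in scaled_values]


def last_actual_date(rows):
    """Return the latest date that still has an actual value, or None."""
    dated = [r["date"] for r in rows if r["actual_value"] is not None]
    # ISO-formatted date strings sort chronologically.
    return max(dated) if dated else None


# Hypothetical rows: two forecast periods follow the last actual value.
rows = [
    {"date": "2023-02-01", "actual_value": 150.0},
    {"date": "2023-03-01", "actual_value": 200.0},
    {"date": "2023-04-01", "actual_value": None},
    {"date": "2023-05-01", "actual_value": None},
]

cutoff = last_actual_date(rows)

# Forecasts produced on the normalized scale, with stored bounds 100 and 300.
normalized_forecast = [0.0, 0.25, 0.75]
forecast = min_max_inverse(normalized_forecast, lo=100.0, hi=300.0)
```

Here `cutoff` marks where history ends and the forecast begins, and `forecast` is expressed in the same units as the original actual values.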

### Identifying Trends in Historical Data

Before training a financial forecasting model, it’s important first to explore our historical data. Doing so allows us to manually identify trends and seasonality of the financial values we want to forecast. The **Exploratory data analysis** flow zone computes all metrics and values needed to generate charts for the **Data Exploration** slide of the Dashboard. Here we can zoom in on specific categories, filter data by date range and category, and compare actual values against manual forecast values over time. Additionally, we can see a breakdown of the actual value by category and the percentage of change in actual values over a period of time. Explanation boxes accompany each graph of this slide.
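Two of the quantities mentioned above, the percentage change over time and the breakdown by category, are straightforward to compute. The following plain-Python sketch shows one way to do so; the function names and sample figures are invented, and the Solution's own flow zone may compute them differently.

```python
from collections import defaultdict


def pct_change(values):
    """Period-over-period change in percent; None where it is undefined."""
    if not values:
        return []
    changes = [None]  # the first period has no predecessor
    for prev, cur in zip(values, values[1:]):
        changes.append(None if prev == 0 else 100.0 * (cur - prev) / prev)
    return changes


def share_by_category(rows):
    """Percentage of the total actual value contributed by each category."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["category"]] += row["actual_value"]
    grand_total = sum(totals.values())
    return {c: 100.0 * t / grand_total for c, t in totals.items()}


# Hypothetical actual values for three periods and two categories.
rows = [
    {"category": "revenue", "actual_value": 100.0},
    {"category": "revenue", "actual_value": 110.0},
    {"category": "revenue", "actual_value": 90.0},
    {"category": "costs", "actual_value": 40.0},
    {"category": "costs", "actual_value": 30.0},
    {"category": "costs", "actual_value": 30.0},
]

revenue = [r["actual_value"] for r in rows if r["category"] == "revenue"]
revenue_change = pct_change(revenue)
shares = share_by_category(rows)
```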

### Simple and Advanced Forecasting

This Solution creates two distinct forecasts:

* Simple Forecast uses time series models to forecast future values over the next horizons per category.

* Advanced Forecast uses an Extra Random Tree regression model to forecast the next horizons.

The Simple Forecast creates an ARIMA model for each category, using the AutoARIMA functionality in Python to select the best parameters. This forecast provides an initial approach to Financial Forecasting and, on its own, can still be a valuable comparison point against Manual Forecasting in terms of accuracy and time saved for Finance teams. Additionally, the forecast and horizon values output by the Simple Forecast are included as predictors in the Advanced Forecast.

The Advanced Forecast uses an Extra Random Tree regression model. Like a Random Forest, it builds an ensemble of decision trees, but it samples split thresholds at random instead of searching for the optimal threshold at each split. If we set lag values and drivers when inputting parameters to the Dataiku Application, the Advanced Forecast will also take these inputs into account when training the model.
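The difference in how split thresholds are chosen can be illustrated on a single feature. The sketch below is a deliberately simplified, plain-Python illustration and not the model used by the Solution: it contrasts an exhaustive threshold search, as in a classic regression tree, with a threshold drawn uniformly at random. All names and data are hypothetical.

```python
import random


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_cost(xs, ys, threshold):
    """Total squared error after splitting the targets at the threshold."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    return sse(left) + sse(right)


def best_threshold(xs, ys):
    """Exhaustive search over midpoints between distinct feature values."""
    points = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return min(candidates, key=lambda t: split_cost(xs, ys, t))


def random_threshold(xs, rng):
    """Draw the cut point uniformly between the feature's min and max."""
    return rng.uniform(min(xs), max(xs))


# Hypothetical feature (e.g. a lag value) and target with a jump after x = 3.
xs = [1, 2, 3, 4, 5, 6]
ys = [10.0, 11.0, 10.0, 30.0, 31.0, 29.0]

rng = random.Random(0)  # seeded so the draw is reproducible
searched = best_threshold(xs, ys)
drawn = random_threshold(xs, rng)

searched_cost = split_cost(xs, ys, searched)
drawn_cost = split_cost(xs, ys, drawn)
```

A single random cut is never better than the searched one on the training data, but averaging many trees built from such cuts tends to reduce variance and makes training cheaper.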

The inevitable question is, which approach is better? In anticipation of this question, this Solution computes the Mean absolute percentage error (MAPE) to compare the performance between the forecasting methods (including Manual Forecast). The lower the MAPE, the better the forecast is. On our data, the Advanced Forecast performed the best but performance can differ based on your data and additional improvements you may make to the model.
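For reference, MAPE is the average of the absolute errors expressed as a percentage of the actual values. The plain-Python sketch below computes it for three forecasts; the figures are invented purely to show the calculation and are not results from the Solution. Skipping periods whose actual value is zero is an assumption of this example.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Periods with an actual value of zero are skipped, since the
    percentage error is undefined there.
    """
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# Invented figures for illustration only.
actual = [100.0, 200.0, 400.0]
forecasts = {
    "manual": [120.0, 180.0, 360.0],
    "simple": [110.0, 190.0, 380.0],
    "advanced": [105.0, 196.0, 400.0],
}

scores = {name: mape(actual, values) for name, values in forecasts.items()}
lowest = min(scores, key=scores.get)  # the method with the lowest MAPE
```

Because MAPE is relative to the actual value, it stays comparable across categories of different magnitude, but it becomes unstable when actual values are close to zero.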

### Visualize Approach Comparisons and Drivers

In the same Dashboard as the previously mentioned **Data Exploration** slide, we can find four other slides to support our understanding of the Solution Approaches and Outputs. Filters are available on every slide by date, category, and/or driver.

The **Forecast Comparison** slide presents the results of all Forecasting approaches (Simple, Advanced, and Manual) for a side-by-side comparison of their MAPE and of their alignment with the Actual Value, both overall and by category. We can also analyze the performance of forecasts over different time horizons and gain insights into cumulative errors and average error percentages.

With the **Drivers** slide, we can analyze the relationship between the actual value and the drivers contained in our data and selected in the Dataiku app. Through this, we can determine which drivers may have a positive or negative impact on the advanced forecast. Users interested in running multiple analyses with different drivers can either re-run the Dataiku application with new parameters for each analysis or create multiple App instances, each with its own set of drivers.

Our final two slides, **Simple Forecast** and **Advanced Forecast**, offer isolated accuracy metrics and visualizations for each approach, as well as information about the models through explainability visualizations.

### Responsible AI Considerations

This Financial Forecasting Solution is intended for use at a business level and should not be used to evaluate individual performance. Misuse of this Solution, such as using it to make decisions that could harm individuals, may result in inaccurate or unreliable forecasts.

### Reproducing these Processes With Minimal Effort For Your Own Data

The intent of this project is to enable finance teams to understand how Dataiku can be used to improve accuracy and reduce the required effort for forecasting processes. By creating a singular solution that can benefit and influence the decisions of various teams in a single organization, smarter and more holistic strategies can be designed to transform existing forecasting processes, automate and streamline workflows, and focus on more strategic tasks.

We’ve provided several suggestions on how to use your historical financial data, but ultimately the “best” approach will depend on your specific needs and your data. If you’re interested in adapting this project to the specific goals and needs of your organization, roll-out and customization services can be offered on demand.
