<span id="version" style="color: grey; float: right">Version 1.0.3</span>

# Objective

This project's goal is to create an insurance pricing model based on historical claim data. The study uses two datasets: a policy dataset with one car insurance policy per line, including information on the driver and the vehicle, and a separate dataset with the claim amounts that occurred during the period. The pricing model aims to set a price for each potential policyholder that accurately reflects their actual risk.

Variables such as Claim Frequency and Claim Severity exhibit very specific behaviors: frequency is a non-negative count and severity is a positive, typically right-skewed amount. They are therefore usually modelled using distributions other than the normal distribution (for example, Poisson for frequency and Gamma for severity). One approach to this challenge uses Generalized Linear Models (GLMs). Unlike popular Machine Learning algorithms such as Gradient Boosting Machines (GBMs) or Neural Networks, GLMs (due to their linear component) force rigid dependencies between features and the response. As a result, iterative feature engineering is especially crucial to make sure these dependencies match intuition and do not bias the results.
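To make the idea of a non-normal GLM concrete, here is a minimal sketch of a Poisson claim-frequency model with a log link and an exposure offset, fitted by Newton-Raphson in plain Python. It is an illustration only, not the implementation used by Dataiku's Visual Machine Learning. The data values, the `young_driver` feature and the `fit_poisson_glm` helper are all hypothetical.

```python
import math

# Hypothetical aggregated data: one row per rating cell (made-up values).
young_driver = [0, 0, 1, 1]               # binary rating feature
claims = [12, 8, 30, 10]                  # observed claim counts
exposure = [100.0, 100.0, 150.0, 50.0]    # policy-years at risk


def fit_poisson_glm(x, y, exposure, n_iter=25):
    """Fit log(E[y]) = log(exposure) + b0 + b1 * x by Newton-Raphson.

    Illustrative helper for one feature plus an intercept; a real project
    would rely on a dedicated GLM library or tool instead.
    """
    # Start from the overall observed frequency, with no feature effect.
    b0 = math.log(sum(y) / sum(exposure))
    b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0            # gradient of the Poisson log-likelihood
        h00 = h01 = h11 = 0.0    # Fisher information (negative Hessian)
        for xi, yi, ei in zip(x, y, exposure):
            mu = ei * math.exp(b0 + b1 * xi)  # expected claim count
            g0 += yi - mu
            g1 += (yi - mu) * xi
            h00 += mu
            h01 += mu * xi
            h11 += mu * xi * xi
        # Solve the 2x2 Newton system explicitly.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_poisson_glm(young_driver, claims, exposure)
base_frequency = math.exp(b0)  # claims per policy-year when young_driver = 0
relativity = math.exp(b1)      # multiplicative effect of young_driver = 1
print(f"base frequency: {base_frequency:.3f}, relativity: {relativity:.3f}")
```

Because of the log link, the coefficient acts multiplicatively on the expected frequency. With this made-up data the fitted base frequency equals the empirical 20 / 200 = 0.10 and the relativity equals 2.0. This is the rigid dependency mentioned above: the feature can only scale the frequency by a constant factor.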

# Value

This project acts as a template for how an actuary could use Dataiku to perform their work. The value of using Dataiku for this kind of modeling can be broken down into the following three components:
 - [Generalized Linear Model Fitting](article:3): Visual Machine Learning includes GLMs, allowing actuaries to model claims inside a robust and visual environment.
 - [Extensive Exploratory Data Analysis and Feature Processing](article:4): Input data can come in many different forms (numeric, categorical, text) and often requires cleaning and preparation before moving on to modeling. Built-in exploration and graphing capabilities allow the user to get a better understanding of the data.
 - [Push to Production through an API](article:5): Finalized models can easily be deployed through an API, and their output consumed outside Dataiku or in a webapp.