# Hands-On Tutorial: Visual ML Features

## Let’s Get Started

In this hands-on tutorial, we’ll explore three features of the visual ML interface in Dataiku: feature encoding methods, native support for the LightGBM algorithm, and ML task queues.

### Feature Encoding

One way to improve model performance is to experiment with feature encoding methods. Dataiku provides several of these methods in the visual ML interface for encoding categorical features and date features.

You can select different encodings including:

* *Frequency encoding*, which replaces each category with its number of occurrences.

* *Ordinal encoding*, which assigns a unique integer value to each category according to an order defined by count or lexicography.

* *Target encoding*, which replaces each category with a numerical value computed based on the target values.

* *Cyclical datetime encoding*, which transforms datetime features (timestamps) into numerical features while preserving the cyclical significance of the date and time periods.

To learn more about these feature handling methods, visit the Features Handling page in the product documentation.
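
To make the three categorical encodings listed above more concrete, here is a minimal sketch in plain Python. It is an illustration only, not Dataiku’s implementation: the function names, the tie-breaking rule in `ordinal_encode`, and the unsmoothed per-category mean in `target_encode` are all assumptions made for this example, as is the sample data.

```python
from collections import Counter, defaultdict


def frequency_encode(values):
    """Replace each category with its number of occurrences."""
    counts = Counter(values)
    return [counts[v] for v in values]


def ordinal_encode(values, order="count"):
    """Assign a unique integer to each category, ordered by count or lexicographically."""
    counts = Counter(values)
    if order == "count":
        # Assumption: most frequent category gets 0, ties broken alphabetically.
        ranked = sorted(counts, key=lambda c: (-counts[c], c))
    elif order == "lexicographic":
        ranked = sorted(counts)
    else:
        raise ValueError(f"unknown order: {order}")
    mapping = {category: i for i, category in enumerate(ranked)}
    return [mapping[v] for v in values]


def target_encode(values, targets):
    """Replace each category with the mean target value observed for it.

    Assumption: a plain mean with no smoothing, for illustration only.
    """
    sums = defaultdict(float)
    counts = Counter()
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    means = {category: sums[category] / counts[category] for category in counts}
    return [means[v] for v in values]


# Hypothetical sample data: a categorical column and a binary target.
states = ["CA", "NY", "CA", "TX", "CA", "NY"]
authorized = [1, 0, 1, 1, 0, 0]

freq = frequency_encode(states)              # [3, 2, 3, 1, 3, 2]
ordinal = ordinal_encode(states)             # [0, 1, 0, 2, 0, 1]
target = target_encode(states, authorized)   # "CA" maps to 2/3, "NY" to 0.0, "TX" to 1.0
```

In practice, a target encoding should be computed on the training data only and then applied to the test data. Otherwise, information about the target leaks into the features.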

### Native Support for LightGBM Algorithm

LightGBM is a modern gradient boosting algorithm, often seen as a successor to XGBoost due to its comparable performance at a fraction of the training time.

To learn more about this, visit the In-memory Python (Scikit-learn / LightGBM / XGBoost) page in the product documentation.

### ML Task Queues

With the ML task queues feature, data scientists can decouple the feature engineering and model training steps by queueing training sessions. These training sessions can have different model designs; for example, they can use different feature handling methods or different algorithms.

Queueing model training sessions eliminates the need to wait for a session’s training to finish before preparing the next model design experiment. Furthermore, these training queues can be scheduled for execution when resources are more abundant to minimize their impact on other teams.
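
The idea of separating design from training can be sketched with a simple first-in, first-out queue. This is a conceptual illustration only: `SessionDesign`, `train`, and the design names are hypothetical and do not correspond to any Dataiku API.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SessionDesign:
    """A hypothetical stand-in for one queued model design."""
    name: str
    algorithms: list
    feature_handling: dict = field(default_factory=dict)


def train(design):
    # Placeholder for real training: just report what would be trained.
    return f"{design.name}: trained {', '.join(design.algorithms)}"


# Design phase: prepare several experiments without waiting for any training.
queue = deque()
queue.append(SessionDesign("design A", ["LightGBM"]))
queue.append(
    SessionDesign(
        "design B",
        ["Random Forest", "LightGBM"],
        {"category_column": "frequency encoding"},
    )
)

# Training phase: process the queued designs later, in the order they were added.
results = []
while queue:
    results.append(train(queue.popleft()))
```

Because the queue only stores designs, the training phase can run at any later time, for example when compute resources are less contended.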

Now that you have some background on these features, we can begin the tutorial.

## Getting Started

You will need a Dataiku project that contains a predictive model. To create the starter project:

* From the Dataiku homepage, click **+New Project > DSS Tutorials > General Topics > Credit Card Fraud (Tutorial)**.

Note

You can also download the starter project from this website and import it as a zip file.

You should now be on the project’s homepage.

* Go to the Flow and select **Build all** from the **Flow Actions** button in the bottom right of the Flow.

* Go to the project’s **Visual Analyses** page from the top navigation bar.

* Open the existing **Prediction Model** analysis.

* Click the **Models** tab to land on the “Result” page of the analysis.

Here, you can see that there is a previously trained session with two models.

### Add a LightGBM Model to an ML Task Queue

Let’s continue with this visual analysis by creating a new model training session that uses the LightGBM algorithm.

* Click the **Design** tab.

* Go to the **Algorithms** panel.

* Click the slider next to **LightGBM** to select it.

* Unselect the “Random Forest” and “Logistic Regression” algorithms.

Because we want to continue designing experimental models, we will wait to train the ML task that we just designed. Let’s add the task to a queue.

* Click the drop-down arrow next to the **Train** button.

* Click **Add To Queue**.

* Name the session `LightGBM`.

* Click **Add To Queue**.

The **Train** button has changed to **Train Queue**, and a new button **Add To Queue** appears next to it.

* Click the **Result** tab to switch to the Result page and notice that the training session for the current model design has been added to the queue.

### Queue a Session That Uses Frequency Encoding of a Categorical Feature

Let’s design another model training session. This time, we’ll include the Random Forest and Logistic Regression algorithms as well as the LightGBM one.

* In the previously trained “Session 1”, click the **Revert Design to This Session** icon (between the session name and the “Delete” icon).

* Click **Confirm** to use the design specified for Session 1.

* In the “Algorithms” panel, select the **LightGBM** algorithm in addition to the already selected “Random Forest” and “Logistic Regression” algorithms.

* Go to the **Features Handling** panel.

* Search for the “Merchant\_state” feature and select it. This feature is currently “dummy encoded.”

* Change the “Category handling” to use the **Frequency encoding** option.

* Click the **Add to Queue** button to add the training session for this ML task to the Queue.

* Name the session `merchant state - frequency encoding`.

* Click **Add to Queue**.

### Queue a Session That Uses Cyclical Encoding of a Numerical Feature

Similarly, we’ll design another model training session. This time, we’ll use all three algorithms (Random Forest, Logistic Regression, and LightGBM).

* Go to the **Result** tab.

* In the previously trained “Session 1”, click the icon **Revert Design to This Session**.

* Click **Confirm** to use the design specified for Session 1.

* In the “Algorithms” panel, select the **LightGBM** algorithm in addition to the already selected “Random Forest” and “Logistic Regression” algorithms.

* Go to the **Features Handling** panel.

* Search for the “purchase\_date” feature and enable it.

Dataiku recognizes that *purchase\_date* is a date feature and selects the “cyclical datetime encoding” feature handling method.

Note

The Cyclical datetime encoding feature handling method is available to all numeric features.

* Click the **Add to Queue** button to add the training session for this ML task to the Queue.

* Name the session `purchase date - datetime encoding`.

* Click **Add to Queue**.
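
To see why a cyclical encoding suits a feature like *purchase\_date*, consider the sketch below. It maps each date-time component onto a circle using a sine/cosine pair, so that values at the end of a period sit next to values at its start. This is a generic illustration of the technique, not Dataiku’s implementation; the choice of periods (hour of day, day of week, month) and all function names are assumptions.

```python
import math
from datetime import datetime


def cyclical(value, period):
    """Map a value in [0, period) to a (sin, cos) point on the unit circle."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def encode_datetime(ts):
    """Encode a few components of a datetime as sine/cosine pairs (assumed periods)."""
    hour_sin, hour_cos = cyclical(ts.hour, 24)
    dow_sin, dow_cos = cyclical(ts.weekday(), 7)
    month_sin, month_cos = cyclical(ts.month - 1, 12)
    return {
        "hour_sin": hour_sin, "hour_cos": hour_cos,
        "dow_sin": dow_sin, "dow_cos": dow_cos,
        "month_sin": month_sin, "month_cos": month_cos,
    }


def hour_distance(h1, h2):
    """Euclidean distance between two hours in the encoded space."""
    return math.dist(cyclical(h1, 24), cyclical(h2, 24))


# Hypothetical purchase timestamp.
features = encode_datetime(datetime(2023, 1, 15, 23, 30))

# 23:00 and 00:00 are one hour apart, and the encoding keeps them close,
# whereas the raw integers 23 and 0 would look far apart.
near = hour_distance(23, 0)
far = hour_distance(0, 12)
```

With the raw hour value, a model would see 23 and 0 as the two most distant hours of the day. In the encoded space they are neighbors, which preserves the cyclical meaning of the time period.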

You can see a list of queued sessions from the Design tab. To do this:

* Click the drop-down arrow next to **Train Queue**.

### Train the ML Task Queue

We’re now ready to train the ML task queue. We’ll show two ways to do this. First, we’ll train the sessions right within the visual ML interface. Then, we’ll train the sessions by running a macro.

Let’s begin by training the queue from the visual ML interface.

* Go to the **Result** tab.

* Click the **Train Queue** button.

The first session in the queue, the “LIGHTGBM” session, begins to train. When this session is done training, the next queued session begins training. Let’s interrupt the session training. To do this:

* Click **Abort** in the Result tab.

Dataiku informs you that aborting the training session will start training the next session in the queue. However, let’s pause the queue, so that Dataiku doesn’t immediately start training the models in the next session.

* Check the box next to “Pause queue before aborting.”

* Click **Confirm**.

Let’s now finish training the ML task queue by running a macro.

* From the top navigation bar, go to the **More Options (…)** menu and click **Macros**.

* In the section for “Builtin macros,” click **Train paused ML task queues**.

You have the option to run “all queues in the current project” or run a “single ML task queue”.

* Select to run **Single ML task queue**.

* Specify “Analysis”: **Prediction Model** and “ML Task”: **Predict authorized\_flag**.

* Click **Run Macro**.

* Click the text **Predict authorized\_flag (Prediction Model)** in the “Train paused ML task queues” pop-up window to open the Result tab of the Prediction Model analysis in a new window.

Notice that the ML task queue continues training from the next session (“Purchase Date - Datetime Encoding”). Training for the third session where we aborted training doesn’t get restarted.

* Wait for the session to finish training and observe the results.
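
The abort, pause, and resume behavior observed in this section can be modeled with a small state machine. The sketch below is a conceptual model only: the `TrainingQueue` class, its methods, and the session names are hypothetical and are not part of any Dataiku API.

```python
from collections import deque


class TrainingQueue:
    """A toy model of a training queue that can be paused and resumed."""

    def __init__(self, sessions):
        self.pending = deque(sessions)
        self.running = None
        self.paused = False
        self.done = []
        self.aborted = []

    def start_next(self):
        """Start the next pending session, unless the queue is paused or empty."""
        if self.paused or not self.pending:
            self.running = None
        else:
            self.running = self.pending.popleft()
        return self.running

    def finish(self):
        """Mark the running session as trained and move on to the next one."""
        if self.running is not None:
            self.done.append(self.running)
        return self.start_next()

    def abort(self, pause_first=False):
        """Abort the running session; an aborted session is never re-queued."""
        if pause_first:
            self.paused = True
        if self.running is not None:
            self.aborted.append(self.running)
        return self.start_next()

    def resume(self):
        """Unpause the queue and continue from the next pending session."""
        self.paused = False
        return self.start_next()


q = TrainingQueue(["session A", "session B", "session C"])
first = q.start_next()                      # "session A" starts training
after_abort = q.abort(pause_first=True)     # paused, so nothing starts
after_resume = q.resume()                   # continues with "session B"
```

Pausing before aborting is what keeps the next session from starting right away. On resume, the queue picks up the next pending session, and the aborted one stays aborted.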
