# Transfer Learning

The MNIST dataset is useful for learning how to train a convolutional network from scratch. However, for most image-related tasks, training a model from scratch takes a very long time, and the resulting model performs poorly unless you have a sufficiently large image archive.

We will now start with a pre-trained model that has learned another dataset such as ImageNet, and use a relatively small archive of 4000 images to perform transfer learning so that the retrained model can classify cats and dogs.

## Preparing the Data

Use the same project we used with the MNIST data or create a new one. Create a new folder *cats\_dogs* in the flow and populate it with the uncompressed contents of the cats\_dogs archive.

In order to perform transfer learning for these images, we need to:

* Create train and test datasets that contain the path to each image (so that the model can find it) and the label that identifies whether each image shows a cat or a dog.

* Download the specifications of a pre-trained model that will be retrained using these images.

As a first step, we’ll create a Python recipe that uses the *cats\_dogs* folder as an input and a new dataset *cats\_dogs\_labels* as the output. The code of the recipe uses the Dataiku Python API to retrieve the list of paths of all images in the folder (and its subfolders), extracts the paths, randomizes the order of records, and writes the paths as a column in the output dataset.

Randomizing the order of records helps ensure that each training batch contains a mix of cats and dogs, which gives better performance than feeding the model all the cats and then all the dogs.

Note

In order to use this code in your project, you’ll need to change “EQysY5vS” to the identifier for the folder in your project.

```python
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
folder = dataiku.Folder("EQysY5vS")

# Initialize dataframe
df = pd.DataFrame(columns=['path'])

# Populate dataframe with image paths (labels are extracted in a later step)
df['path'] = folder.list_paths_in_partition()

# Randomize order of records
df = df.sample(frac=1).reset_index(drop=True)

# Write recipe outputs
cats_dogs_labels = dataiku.Dataset("cats_dogs_labels")
cats_dogs_labels.write_with_schema(df)
```

Our next step is to use a Prepare recipe to extract the label from the image path.
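For readers who prefer to see the logic in code, the label extraction can be sketched in plain Python. This is only an illustration, not the Prepare recipe itself: it assumes that each file name starts with its class name (for example `cat.0.jpg` or `dog.17.jpg`), which may not match the layout of your archive. The function name `label_from_path` and the example paths are hypothetical.

```python
import posixpath

# Assumption: file names begin with the class name, e.g. "cat.0.jpg".
KNOWN_LABELS = ("cat", "dog")


def label_from_path(path):
    """Return 'cat' or 'dog' based on the start of the file name."""
    filename = posixpath.basename(path).lower()
    for label in KNOWN_LABELS:
        if filename.startswith(label):
            return label
    raise ValueError("Cannot infer a label from path: %r" % path)


# Hypothetical paths, shaped like entries of the 'path' column.
example_paths = ["/cats_dogs/cat.0.jpg", "/cats_dogs/dog.17.jpg"]
labels = [label_from_path(p) for p in example_paths]
```

In a Python recipe, the same function could be applied to the dataframe with `df['label'] = df['path'].map(label_from_path)`.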

Next, we’ll use a Split recipe to assign records into the train and test datasets.
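As a rough picture of what a random split does, here is a minimal sketch in plain Python. The 80/20 ratio, the fixed seed, and the helper name `split_train_test` are assumptions made for illustration; they are not settings taken from the Split recipe.

```python
import random


def split_train_test(records, train_ratio=0.8, seed=42):
    """Shuffle records reproducibly and cut them into train and test lists."""
    rng = random.Random(seed)   # fixed seed so the split can be reproduced
    shuffled = list(records)    # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical paths standing in for the rows of the labeled dataset.
paths = ["/cats_dogs/cat.%d.jpg" % i for i in range(10)]
train, test = split_train_test(paths, train_ratio=0.8)
```

Changing `seed` produces a different split of the same records, which is a convenient way to check how sensitive the results are to the split.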

Note

The quality of transfer learning for image classification depends on the training and testing sets. In particular, different random splits of this set of 4000 images can give excellent or poor results. This underscores the importance of having a testing set, and suggests that 4000 images is not very many to train on for this application.

Finally, before going to the Lab, we need to download the pre-trained weights and place them in a folder. Assuming you installed the Deep Learning for images plugin as noted in the Prerequisites, go to the macro page of your project, click on the box **Download pre-trained model**, type `xception_weights` as the output folder, and select the Xception architecture.

## The Deep Learning Model

In a Visual Analysis for the training dataset (from the dataset’s Actions menu, Lab > Visual Analysis), create a new model with:

* **Prediction** as the task,

* *label* as the target variable

* **Deep learning** as the Expert mode, then click **Create**

This creates a new machine learning task and opens the Design tab for the task. On the Target panel, verify that Dataiku DSS correctly identifies this as a Two-class classification type of ML task.

### Features Handling

On the Features Handling panel, turn on *path* as an input, and select **Image** as its variable type.

Select the folder that contains the image archive. *IMPORTANT:* the trained model will look for images in this directory. If we want to score new images of cats and dogs, they will need to be placed in this folder.

We won’t use the default code, so remove all of it. Then, click on **{} Code Samples** on the top right and select the **Default Keras preprocessing for image** code sample.

Insert the Keras code then change the resized width and height variables from `197` to `299`.

### Deep Learning Architecture

We now have to create our network architecture in the `build_model()` function. We won’t use the default architecture, so just remove all the code. Then, click on **{} Code Samples** on the top right and search for “images”. Open the **Pre-trained architecture to classify images (Xception)**, and scroll down to choose the **Architecture including weights download** option.

Insert the code; in order to retrain the model, we need to make a few changes to the code.

* In the line that defines `image_shape`, change `197, 197, 3` to `299, 299, 3`.

* In the line that defines `image_input_name`, change `name_of_your_image_input_preprocessed` to `path_preprocessed`.

* In the line that defines `folder`, change `name_of_folder_containing_xception_weights` to `xception_weights`.

* Add the following lines after the call to `base_model = Xception(include_top=False,input_tensor=image_input)`. This freezes the Xception layers, because they are already trained to distinguish visual features. Sometimes we will want to retrain these layers, but not now.

```python
for layer in base_model.layers:
    layer.trainable = False
```

### Deep Learning Training Settings

Now that our architecture is set up, go to the **Training** panel, then click on **Advanced mode** in the upper-right corner. In this code editor we will add image augmentation and a callback.

* First, replace the whole existing code with the content of the **Augmentation of images** sample.

* In the added `build_sequences()` function, change the batch size from `8` to `16`, and replace the image input name with `path_preprocessed`.

`ImageDataGenerator` defines how some images fed to the neural network will be randomly altered. These alterations are subtle enough that we can still recognize each image's content, and their goal is to help the model generalize to unseen pictures.

`DataAugmentationSequence` defines the augmented training sequence, and takes as inputs the existing training sequence, the input containing the images (in this case, `path_preprocessed`), the augmentations to be performed that we just defined, and the number of augmentations to perform. The number of augmentations corresponds to the number of times an image is augmented in the same batch. When in doubt, leave this parameter set to 1.
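To make the idea of random alteration concrete, here is a small NumPy sketch. It is a simplified stand-in and not the code behind `ImageDataGenerator` or `DataAugmentationSequence`: the `augment` function, the flip probability, and the `max_shift` parameter are all hypothetical, and the shift wraps pixels around the edge rather than filling them the way a real augmentation library would.

```python
import numpy as np


def augment(image, rng, max_shift=2):
    """Randomly flip an H x W x C image left-right, then shift it horizontally."""
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1, :]              # horizontal flip
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(out, shift, axis=1)     # simplification: pixels wrap around


rng = np.random.default_rng(0)
# Tiny fake "image" so the effect is easy to inspect.
image = np.arange(4 * 4 * 3, dtype=np.float32).reshape(4, 4, 3)

# Assumption for illustration: one altered copy per image.
n_augmentation = 1
augmented_batch = np.stack([augment(image, rng) for _ in range(n_augmentation)])
```

Each altered copy keeps the same shape and the same pixel values as the original, only rearranged, so the label of the image is unchanged.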

We’ll then add a new callback, a small class that can apply an operation on the model at each batch or at each epoch. You can choose one from the Keras API or code your own. This callback will reduce the learning rate when model performance on the validation set stops improving.

* Add the whole **Decrease the learning rate when no improvement in performance (Callback)** code sample to the code snippet.
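The logic behind such a callback can be sketched in plain Python. This is a simplified illustration and not the code sample itself: the class name `PlateauLearningRate` and the values chosen for `factor`, `patience`, and `min_lr` are assumptions. Keras ships a `ReduceLROnPlateau` callback built on the same idea.

```python
class PlateauLearningRate:
    """Multiply the learning rate by `factor` after `patience` epochs without improvement."""

    def __init__(self, initial_lr, factor=0.5, patience=2, min_lr=1e-6):
        self.lr = initial_lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.best_loss = float("inf")
        self.stale_epochs = 0

    def on_epoch_end(self, val_loss):
        """Update the learning rate from the latest validation loss and return it."""
        if val_loss < self.best_loss:
            # Improvement: remember it and reset the counter.
            self.best_loss = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                # Plateau detected: shrink the rate, but never below min_lr.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.stale_epochs = 0
        return self.lr


scheduler = PlateauLearningRate(initial_lr=0.01, factor=0.5, patience=2)
val_losses = [0.9, 0.7, 0.71, 0.72, 0.6]   # made-up validation losses
history = [scheduler.on_epoch_end(loss) for loss in val_losses]
```

With these made-up losses, the rate is halved once, after the second consecutive epoch without improvement.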

## Model Results

Click **Train**. If you’re running on CPU, it will take some time to complete. When the training finishes, deploy the model to the flow, create an evaluation recipe from the model, and evaluate on the test data. In our example, you can see that the model has an accuracy of about 98.25%. Your results will vary from this example.
