Introduction to fastai v2

Karan Tanwar
Published in Analytics Vidhya · 7 min read · Jan 27, 2021

fastai is a high-level framework built on top of PyTorch for training machine learning models and achieving state-of-the-art performance in very few lines of code. fastai is to PyTorch what Keras is to TensorFlow.

This article is a beginner's guide of sorts, which might be helpful to someone trying to learn this framework, probably because someone told them it's the new sexy in ML ;)

As useful as this article might be for others, it is really helpful for me too. I see my articles as a self-created wiki, or a compilation of all those Stack Overflow answers that took me more than 20 hours to find.

Writing a blog, or explaining the stuff you learnt to someone, is a great way to understand the concepts more thoroughly. Enough chit-chat, let's dive into it!

Overview

There are many articles on fastai v1. But since the developers have rebuilt v2 from the ground up, there is very little information on it, except for the official documentation, which, in my personal experience, is not very detailed.

Anyway, you can go to their official documentation here.

They have divided the framework into three sections: tabular, vision and text. Tabular covers machine learning problems where we have a list of features for each user or device, or, as we call them technically, samples. Vision covers problems where the dataset consists of images and we might have to do classification, semantic segmentation or object detection. Text covers natural language processing.

In this article, I am going to explain only one component: vision.

Vision: A general pipeline

1. Import libraries
import os
import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pathlib import Path
from PIL import Image
from tqdm import tqdm

import fastai
from fastai.vision.all import *      # also re-exports pd, np, Path, etc.
from fastai.vision.augment import *

# Train on GPU if one is available, otherwise fall back to CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
device

This will import all the necessary libraries.

2. Create dataset generator

We can load our dataset in many ways. I will be using a DataFrame for creating the DataLoader.

# Paths to the HackerEarth Holiday Season dataset (as laid out on Kaggle)
csv_path = '../input/hackerearth-holiday-season/dataset/train.csv'
train_path = '../input/hackerearth-holiday-season/dataset/train'
test_path = '../input/hackerearth-holiday-season/dataset/test'
# Load the training labels CSV into a DataFrame
df = pd.read_csv(csv_path)
df

We will be modifying our Image column so that it becomes a direct path to the image location. For that we will use pandas.DataFrame.apply(), which is very useful for modifying the contents of a DataFrame.

# Prepend the train directory so each entry is a full path to the image
df['Image'] = df['Image'].apply(lambda x: os.path.join(train_path, x))
df

Now we have our Image column exactly how we want. Great!

For plotting the class distribution as a bar chart, pandas already gives us a nice one-liner!

df['Class'].value_counts().plot.bar()

Clearly an imbalanced dataset, which can be handled in many ways.
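
For instance, one common remedy (a sketch of my own, not part of this article's pipeline) is to weight the loss inversely to class frequency and hand that loss to the learner later on:

# Sketch: class-weighted loss to counter the imbalance.
# Assumes the weight order matches dls.vocab (fastai sorts labels
# alphabetically by default, as does sort_index() here).
counts = df['Class'].value_counts().sort_index()
weights = torch.tensor((1.0 / counts).values, dtype=torch.float32)
weights = weights * (len(counts) / weights.sum())  # normalize around 1
weighted_loss = CrossEntropyLossFlat(weight=weights.to(device))
# later: learn = cnn_learner(dls, resnet34, loss_func=weighted_loss, ...)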

Now, we would like to finally create our DataLoaders. For that, fastai provides the class ImageDataLoaders, which has a factory method from_df().

img_size = 128
augmentations = [
    Rotate(10, p=0.4, mode='bilinear'),
    Brightness(max_lighting=0.3, p=0.5),
    Contrast(max_lighting=0.4, p=0.5),
    RandomErasing(p=0.3, sl=0.0, sh=0.2, min_aspect=0.3, max_count=1),
    Flip(p=0.5),
    Zoom(max_zoom=1.1, p=0.5),  # max_zoom must be > 1, otherwise Zoom is a no-op
    RandomResizedCrop(img_size)
]
dls = ImageDataLoaders.from_df(df=df,
                               path='.',
                               valid_pct=0.2,
                               bs=32,
                               device=device,
                               num_workers=0,
                               batch_tfms=augmentations,
                               item_tfms=Resize(img_size))
dls.show_batch()
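
Once dls is created, a quick inspection (my own sanity check, not part of the original flow) confirms the inferred labels and the 80/20 split:

print(dls.vocab)                              # the class labels fastai inferred
print(len(dls.train_ds), len(dls.valid_ds))   # sizes of the 80/20 split
dls.show_batch(max_n=9)                       # a grid of augmented samples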

fastai v2 allows multiple ways to create DataLoaders apart from from_df. We can use ImageDataLoaders.from_folder, which assumes two sub-directories in the directory mentioned in the path parameter, train/ and valid/ by default.

ImageDataLoaders.from_folder(path, train='train', valid='valid', valid_pct=None, seed=None, vocab=None, item_tfms=None, batch_tfms=None, bs=64, val_bs=None, shuffle_train=True, device=None)
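
A minimal usage sketch (the directory name is hypothetical; the layout assumed is path/train/<class>/*.jpg plus path/valid/<class>/*.jpg):

path = Path('../input/some-image-dataset')   # hypothetical dataset root
dls = ImageDataLoaders.from_folder(path,
                                   train='train', valid='valid',
                                   item_tfms=Resize(128), bs=32)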

There is another way: ImageDataLoaders.from_name_re, which is useful if we have labels in the image names themselves, for example img_001_cat.jpg and img_742_dog.jpg.

ImageDataLoaders.from_name_re(path, fnames, pat, bs=64, val_bs=None, shuffle_train=True, device=None)
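
For the file names above, a sketch might look like this (folder name hypothetical; the regex's first capture group becomes the label):

path = Path('../input/pet-images')            # hypothetical image folder
fnames = get_image_files(path)                # all image files under path
# Capture the token after the numeric id: img_001_cat.jpg -> 'cat'
dls = ImageDataLoaders.from_name_re(path, fnames, pat=r'img_\d+_(.+)\.jpg$',
                                    item_tfms=Resize(128), bs=32)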

There are many other modular ways to create dataloader. You can check that out in more detail, here.

So here we have some new terms. For adding augmentations to our dataset, we create a list of augmentations such as Rotate, RandomErasing, RandomResizedCrop, Flip etc. Their names are quite literal, so it is easy to understand what each of them does.

For creating dls, we mention the target dataframe, that is, df. valid_pct denotes the fraction of the data held out as the validation set, on which error rate and loss are calculated after each epoch (the validation set is never used for backward propagation).

The path parameter denotes the directory relative to which the image paths in the dataframe are resolved. Since I have already prepended the full path in the df column, we simply put ‘.’ as path.

There are batch_tfms and item_tfms. item_tfms denotes the transforms that are applied to every image, in both the training and the validation set. batch_tfms denotes the augmentations that are applied batch-wise, to the batches on which we are training our model; these random augmentations run on training data only. (Thus we don't need to add Resize() to the augmentations defined above.)

My understanding of why there are two separate transform arguments:

item_tfms happens first, followed by batch_tfms. This enables most of the calculations for the transforms to happen on the GPU, thus saving time.

The first step (item_tfms) resizes all images to the same size (this happens on the CPU) so that they can be collated into a batch; batch_tfms then runs on the whole batch on the GPU. If there were only one stage, every transform would have to be applied per-image on the CPU, which is slower.

Output of dls.show_batch()
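
A quick way to verify this two-stage pipeline (my own check, assuming the dls built above):

xb, yb = dls.one_batch()
print(xb.shape, xb.device)  # e.g. torch.Size([32, 3, 128, 128]) on cuda:0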

3. Learner

learn = cnn_learner(dls,
                    resnet34,
                    metrics=[accuracy, error_rate])

cnn_learner will return a Learner wrapping the specified model (resnet34 in this case), pretrained on ImageNet by default. The metrics will be printed along with the train and validation loss after each epoch: accuracy shows the accuracy on the validation set, and error_rate will show, well, the error rate, pretty self-explanatory (it is literally 1 - accuracy).

learn.lr_find()

lr_find is a very good way of choosing an appropriate learning rate. The rule of thumb is to choose lr = min_lr / 10, where min_lr is the learning rate at which the loss was minimum. This is because choosing min_lr itself might result in a slightly too large lr, which might make your loss diverge after some epochs. For example, in this plot, I'd choose a learning rate of 1e-2.
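
As a side note, lr_find also returns suggested values you can use programmatically (a sketch; the exact fields depend on the fastai version):

suggestion = learn.lr_find()   # plots the curve and returns suggestions
print(suggestion)              # e.g. SuggestedLRs(lr_min=..., lr_steep=...)
lr = suggestion.lr_min         # or pick a value off the plot by hand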

Now we fit the model using the command:

learn.fit_one_cycle(epochs, lr, wd)
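
For example, with the learning rate picked from the plot above (illustrative values):

learn.fit_one_cycle(5, lr_max=1e-2, wd=1e-2)  # 5 epochs with the 1cycle policy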

fastai v2 also has learn.fit(), which takes the same kind of parameters but trains with the fixed learning rate given by the user, whereas learn.fit_one_cycle() uses a cyclic learning-rate scheduler whose maximum is the lr given by the user. wd denotes the amount of weight decay; fastai encourages wd=1e-2. Weight decay is a classic method to reduce overfitting by introducing regularization. With vanilla SGD the weights are updated as:

w = w - lr * w.grad - lr * wd * w
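
A tiny numeric check of this update rule (made-up values):

w, grad, lr, wd = 1.0, 0.5, 0.1, 1e-2
w_new = w - lr * grad - lr * wd * w
print(w_new)  # 1.0 - 0.05 - 0.001 = 0.949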

You can also unfreeze the model by:

learn.unfreeze() # Set requires_grad=True for all layers

which means that the parameters of the base model are now also trainable. Be careful: after unfreezing the model, one should use a very low learning rate, since unfreezing is meant only for fine-tuning the model.

One can also assign different learning rates to different parts of the model after unfreezing.

learn.unfreeze()
# Discriminative learning rates: the earliest layer group trains at 1e-5,
# the middle group at 1e-4, and the head at 5e-4 (assumes three parameter
# groups, which cnn_learner's resnet34 split provides).
learning_rate = [1e-5, 1e-4, 5e-4]
learn.fit(3, lr=learning_rate)

4. Plots

After training we can see some useful plots.

4.a For plotting the loss curves for training and validation, write:

learn.recorder.plot_loss()

PS: learn.recorder.plot_metrics() was used in fastai v1 but is deprecated now.

4.b Plot confusion matrix

interp = ClassificationInterpretation.from_learner(learn)
losses, idxs = interp.top_losses()
interp.plot_confusion_matrix(figsize=(7, 7), dpi=100)
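
Also handy here (my addition, not in the original pipeline): most_confused lists the class pairs the model mixes up most often:

interp.most_confused(min_val=2)  # (actual, predicted, count) pairs with count >= 2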

4.c Top loss images

interp.plot_top_losses(4, figsize=(10,11))

This command shows the images that had the highest loss (on the validation set by default). This is pretty useful for debugging purposes.

So this was a little crash course/wiki type of article. I hope you liked it. If you see any discrepancies or incorrect statements, do let me know in the comments or on LinkedIn. Also, if you'd like me to add some other aspects of fastai (such as tabular and text in fastai v2) or any other deep learning concepts, let me know. Peace out ✌️
