How to add additional layers to a pre-trained model using PyTorch
Many of us find it difficult to add extra layers to a pre-trained model and to connect those layers to the model. Here I am going to make it simple, so that everyone can benefit from it. Read this through once and we will be good to go.
I am going to use the architecture of two small models (EfficientNet_b0 and ResNet18) as examples to explain the topic.
EfficientNet_b0 →
First, we install the package that provides the pre-trained model:
!pip install efficientnet_pytorch
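Then, following the GitHub repository of EfficientNet for PyTorch, we import the model class, along with the PyTorch modules used in the rest of this article:
import torch
import torch.nn as nn
from efficientnet_pytorch import EfficientNet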
Next, we define our own class:
class EfficientNet_b0(nn.Module):
After that, we define the constructor for our class:
    def __init__(self):
        super(EfficientNet_b0, self).__init__()
        # super(EfficientNet_b0, self).__init__() runs the constructor of nn.Module (the parent class named above); this must happen before any layers are assigned to self.
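To see why the `super().__init__()` call matters, here is a minimal sketch. The class names `WithInit` and `WithoutInit` are hypothetical and only serve the illustration. The constructor of `nn.Module` sets up the internal registries that track sub-modules and parameters, so assigning a layer before calling it raises an `AttributeError`.

```python
import torch.nn as nn


class WithInit(nn.Module):
    def __init__(self):
        super().__init__()         # sets up nn.Module's internal registries
        self.fc = nn.Linear(4, 2)  # registered as a sub-module


class WithoutInit(nn.Module):
    def __init__(self):
        # super().__init__() is deliberately omitted here
        self.fc = nn.Linear(4, 2)  # fails: the registries do not exist yet


good = WithInit()
registered = [name for name, _ in good.named_parameters()]

try:
    WithoutInit()
    failed = False
except AttributeError:
    failed = True

print(registered)
print(failed)
```

The first class exposes the weight and bias of `fc` through `named_parameters()`, which is what an optimizer later relies on; the second one never gets that far.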
After that, we load the pre-trained EfficientNet model:
        self.model = EfficientNet.from_pretrained('efficientnet-b0')
Finally, I decided to add the following extra layers: a dense layer, then a batch normalisation layer, then a dropout layer, and finally two more dense layers.
        self.classifier_layer = nn.Sequential(
            nn.Linear(1280, 512),
            nn.BatchNorm1d(512),
            nn.Dropout(0.2),
            nn.Linear(512, 256),
            nn.Linear(256, num_classes)  # num_classes: the number of output classes in your dataset
        )
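A quick way to check the extra layers on their own is to push a random batch through them. The sketch below assumes a hypothetical `num_classes = 5` and uses random numbers in place of real pooled features from the backbone.

```python
import torch
import torch.nn as nn

num_classes = 5  # hypothetical value; use the number of classes in your dataset

classifier_layer = nn.Sequential(
    nn.Linear(1280, 512),
    nn.BatchNorm1d(512),
    nn.Dropout(0.2),
    nn.Linear(512, 256),
    nn.Linear(256, num_classes),
)

# A fake batch of 4 feature vectors, each with 1280 values,
# standing in for the pooled output of the backbone.
features = torch.randn(4, 1280)
logits = classifier_layer(features)
print(logits.shape)  # torch.Size([4, 5])
```

Note that `nn.BatchNorm1d` needs more than one sample per batch while the module is in training mode, which is why the fake batch has 4 rows rather than 1.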
NOTE: The first additional dense layer, nn.Linear(1280, 512), has 1280 as in_features because, if we print the model, the last layer (_fc) of the efficientnet-b0 model has 1280 in_features and 1000 out_features, the latter corresponding to the 1000 image classes on which it was pre-trained.
(_bn1): BatchNorm2d(1280, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True)
(_avg_pooling): AdaptiveAvgPool2d(output_size=1)
(_dropout): Dropout(p=0.2, inplace=False)
(_fc): Linear(in_features=1280, out_features=1000, bias=True)
(_swish): MemoryEfficientSwish()
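Instead of reading the number off the printout, the input size can also be taken from the layer itself, since every `nn.Linear` stores it in its `in_features` attribute. The sketch below uses a hypothetical stand-in class, `TinyBackbone`, that only mimics the `_fc` entry from the printout; with the real model the same attribute would be reached through the loaded model's `_fc` layer.

```python
import torch.nn as nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in: only mimics the final `_fc` layer of the printout."""

    def __init__(self):
        super().__init__()
        self._fc = nn.Linear(1280, 1000)


backbone = TinyBackbone()

# Read the size of the features that feed the original classifier.
in_features = backbone._fc.in_features

# Use it as the input size of the first extra dense layer.
first_extra_layer = nn.Linear(in_features, 512)
print(in_features)
```

Reading the value this way keeps the extra layers in step with the backbone if you later switch to a larger variant with a different feature size.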
After this, we define the forward function, which connects the pre-trained model to the layers we defined. To do this, we first look at the forward function of the model class in the GitHub repo:
    # forward function of the EfficientNet model
    def forward(self, inputs):
        x = self.extract_features(inputs)
        x = self._avg_pooling(x)
        x = x.flatten(start_dim=1)
        x = self._dropout(x)
        x = self._fc(x)
        return x
We change just a few things: we replace the last layer (_fc) with our classifier_layer, and we prefix the remaining calls with self.model, because the pre-trained network is stored as self.model in our constructor.
    def forward(self, inputs):
        x = self.model.extract_features(inputs)
        x = self.model._avg_pooling(x)
        x = x.flatten(start_dim=1)
        x = self.model._dropout(x)
        x = self.classifier_layer(x)
        return x
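The pieces above can be put together and tried on a dummy batch. The sketch below is runnable without downloading any weights because it swaps the pre-trained network for a hypothetical stand-in, `FakeBackbone`, that only exposes the same attribute names (`extract_features`, `_avg_pooling`, `_dropout`) and the same 1280 output channels. Passing `num_classes` to the constructor is also an assumption made for this illustration.

```python
import torch
import torch.nn as nn


class FakeBackbone(nn.Module):
    """Hypothetical stand-in for the pre-trained model (not the real network)."""

    def __init__(self):
        super().__init__()
        self._conv = nn.Conv2d(3, 1280, kernel_size=3, stride=32)
        self._avg_pooling = nn.AdaptiveAvgPool2d(1)
        self._dropout = nn.Dropout(0.2)

    def extract_features(self, inputs):
        return self._conv(inputs)


class EfficientNet_b0(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        # The article loads the pre-trained model here instead.
        self.model = FakeBackbone()
        self.classifier_layer = nn.Sequential(
            nn.Linear(1280, 512),
            nn.BatchNorm1d(512),
            nn.Dropout(0.2),
            nn.Linear(512, 256),
            nn.Linear(256, num_classes),
        )

    def forward(self, inputs):
        x = self.model.extract_features(inputs)  # (N, 1280, H, W)
        x = self.model._avg_pooling(x)           # (N, 1280, 1, 1)
        x = x.flatten(start_dim=1)               # (N, 1280)
        x = self.model._dropout(x)
        x = self.classifier_layer(x)             # (N, num_classes)
        return x


net = EfficientNet_b0(num_classes=3)
net.eval()  # disable dropout and use running batch-norm statistics
with torch.no_grad():
    out = net(torch.randn(2, 3, 64, 64))
print(out.shape)  # torch.Size([2, 3])
```

The shape comments in `forward` show where the 1280 in the first dense layer comes from: it is the channel count left after pooling and flattening.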
And with that, it is ready.
ResNet18 →
For the ResNet18 model we follow the same steps as before. First, we install and import the package:
!pip install pretrainedmodels
import pretrainedmodels
We define the model class and the constructor as:
class Resnet18(nn.Module):
    def __init__(self):
        super(Resnet18, self).__init__()
        self.model = pretrainedmodels.__dict__['resnet18'](pretrained='imagenet')
Then we add our extra layers, just as we did in the EfficientNet_b0 model:
        self.classifier_layer = nn.Sequential(
            nn.Linear(512, 256),
            nn.BatchNorm1d(256),
            nn.Dropout(0.2),
            nn.Linear(256, 128),
            nn.Linear(128, num_classes)  # num_classes: the number of output classes in your dataset
        )
NOTE: The first additional dense layer, nn.Linear(512, 256), has 512 as in_features because, if we print the model, the last layer (last_linear) of the resnet18 model has 512 in_features and 1000 out_features, the latter corresponding to the 1000 image classes on which it was pre-trained.
(avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
(fc): None
(last_linear): Linear(in_features=512, out_features=1000, bias=True)
Finally, we define the forward function by looking at the forward function of the model class in the GitHub repo and changing it to:
    def forward(self, x):
        batch_size = x.shape[0]  # take the batch size from the input images
        x = self.model.features(x)
        # pool each feature map down to 1x1, then reshape to (batch_size, 512)
        x = torch.nn.functional.adaptive_avg_pool2d(x, 1).reshape(batch_size, -1)
        x = self.classifier_layer(x)
        return x
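The pooling and reshaping step in this forward function can be checked in isolation. The sketch below uses a random tensor in place of real backbone output; its shape, (N, 512, 7, 7), is what ResNet18 produces before pooling for 224x224 input images.

```python
import torch
import torch.nn.functional as F

# Fake feature maps standing in for the output of self.model.features(x).
x = torch.randn(2, 512, 7, 7)
batch_size = x.shape[0]

pooled = F.adaptive_avg_pool2d(x, 1)   # (2, 512, 1, 1)
flat = pooled.reshape(batch_size, -1)  # (2, 512)

# Pooling to a 1x1 output is the same as averaging over height and width.
same_as_mean = torch.allclose(flat, x.mean(dim=(2, 3)), atol=1e-5)

print(pooled.shape)
print(flat.shape)
print(same_as_mean)
```

After this step each image is described by 512 numbers, which matches the in_features of the first extra dense layer.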
After all these changes, our class is complete.
We are now ready to use our model classes by creating their objects. I hope you have understood this. If you have any questions, comments or concerns, please let me know in the comment section. Until then, enjoy learning.