Today we will learn how to do object detection in PyTorch in the simplest manner possible. We will work with the X-ray dataset available on Kaggle. Every piece of code is accompanied by comments.


In this article, we will see how to apply any metric from the scikit-learn library in our PyTorch code.

In our training loop, calling model(input_data_from_dataloader) gives us regression outputs for a regression model, or classification confidence probabilities for each of our n classes. scikit-learn metrics work on NumPy arrays, so if we are training on a GPU or TPU we first need to detach both the predictions and the real values from the computation graph, move them to the CPU, and convert them to NumPy arrays with this code.

real_targets = real_targets.detach().cpu().numpy()
predicted_targets = predicted_targets.detach().cpu().numpy()
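Once both arrays are on the CPU as NumPy arrays, they can be passed to a scikit-learn metric. Here is a minimal sketch of that step; the logits and labels are made-up values, and accuracy_score is just one possible choice of metric.

```python
import torch
from sklearn import metrics

# Made-up model outputs for 4 samples and 3 classes (illustrative values only)
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.1, 0.2, 3.0],
                       [1.2, 0.9, 0.1]], requires_grad=True)
real_targets = torch.tensor([0, 1, 2, 1])

# Pick the most confident class for each sample
predicted_targets = torch.argmax(logits, dim=1)

# Detach from the graph, move to CPU and convert to NumPy arrays
real_np = real_targets.detach().cpu().numpy()
pred_np = predicted_targets.detach().cpu().numpy()

# Any scikit-learn metric that accepts NumPy arrays can be used here
accuracy = metrics.accuracy_score(real_np, pred_np)
print(accuracy)
```

The same pattern should work for other metrics such as f1_score or mean_squared_error, as long as the arrays have the shape that metric expects.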


In this article, I am going to define a function that will help the community save the best model after training across all the folds of a dataset.

For this, we first partition our dataframe into a number of folds of our choice.

from sklearn import model_selection

dataframe["kfold"] = -1 # defining a new column in our dataset
# shuffling all the rows of our dataframe (frac=1) and resetting its index
dataframe = dataframe.sample(frac=1).reset_index(drop=True)
# storing the target values in a list
y = dataframe.label.values
# defining the number of required folds we want…
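The excerpt above is cut off, so here is one way the fold assignment could be completed. This is only a sketch: the toy dataframe, the choice of StratifiedKFold, and the value of 5 folds are my assumptions, not necessarily what the full article uses.

```python
import pandas as pd
from sklearn import model_selection

# Toy dataframe standing in for the real one (made-up data)
dataframe = pd.DataFrame({"image_id": range(20), "label": [0, 1] * 10})

dataframe["kfold"] = -1  # new column that will hold the fold number
# shuffle all rows and reset the index
dataframe = dataframe.sample(frac=1, random_state=42).reset_index(drop=True)
y = dataframe.label.values

# Assumption: stratified folds, so each fold keeps the label distribution
kf = model_selection.StratifiedKFold(n_splits=5)

# Mark every row with the fold in which it is used for validation
for fold, (train_idx, valid_idx) in enumerate(kf.split(X=dataframe, y=y)):
    dataframe.loc[valid_idx, "kfold"] = fold

print(dataframe.kfold.value_counts())
```

With the kfold column in place, the rows where kfold equals the current fold number form the validation set and all the other rows form the training set.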


In image classification, when we predict classes for our test set images after training our model, we can generate the confidence probabilities for each test image n times (typically on differently augmented copies of the image), average them, and finally assign the image the class with the highest average probability. This is called test time augmentation.

To achieve this in PyTorch, we first define the test dataset class like this:

import torch

class test_Dataset(torch.utils.data.Dataset):
    def __init__(self, ids, image_ids):
        self.ids = ids
        self.image_ids = image_ids  # list of test set image ids
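The averaging step itself can be sketched as follows. Everything here is illustrative: the tiny linear model, the random_flip helper, and n_tta=5 are assumptions I made to keep the example self-contained, and any trained classifier and augmentation pipeline could take their place.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in classifier (assumption): 3x8x8 images, 4 classes
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 4))
model.eval()

def random_flip(images):
    # Hypothetical augmentation: flip the batch horizontally half of the time
    if torch.rand(1).item() < 0.5:
        return torch.flip(images, dims=[3])
    return images

def predict_with_tta(model, images, n_tta=5):
    all_probs = []
    with torch.no_grad():
        for _ in range(n_tta):
            logits = model(random_flip(images))
            all_probs.append(torch.softmax(logits, dim=1))
    # Average the probabilities over the n_tta passes
    mean_probs = torch.stack(all_probs).mean(dim=0)
    # The class with the highest average probability is the prediction
    return mean_probs, mean_probs.argmax(dim=1)

images = torch.rand(6, 3, 8, 8)  # made-up batch of test images
mean_probs, predicted_classes = predict_with_tta(model, images)
print(predicted_classes)
```

Averaging probabilities rather than raw logits keeps every pass on the same scale, which is why softmax is applied before the mean in this sketch.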


In this article, I am going to talk about how we can prepare a sound (audio) dataset so that we are able to classify it. It may be a little long, but if you have the patience to read it thoroughly once, it will be highly beneficial.

  1. First we will load the audio file. From the directory containing n sound files, we will first try to load 2–3 of them using torchaudio.load. torchaudio supports sound files in formats such as '.wav' and '.mp3', and torchaudio.load returns the waveform and the sample rate of the sound file…


This blog will help readers understand how we can combine image and tabular data in PyTorch using deep learning and generate predictions from the model. I will go step by step so that the whole end-to-end scenario is easy to follow:

Here I am going to use the OpenVaccine: COVID-19 mRNA Vaccine Degradation Prediction dataset. After cleaning and processing the tabular data, I will create the dataset class for our data as follows…

As there is a lot of code and syntax, I recommend reading the code alongside its comments…
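To give a rough idea of what combining the two kinds of input can look like, here is a minimal sketch of a model with one branch per input type. The class name ImageTabularNet, the layer sizes, and the random inputs are all hypothetical and are not taken from the full article.

```python
import torch
import torch.nn as nn

class ImageTabularNet(nn.Module):
    # Hypothetical model: one branch per input type, joined before the head
    def __init__(self, n_tabular_features, n_outputs):
        super().__init__()
        # Small CNN branch for the image, ends in 8 features per sample
        self.image_branch = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Dense branch for the tabular columns, ends in 16 features
        self.tabular_branch = nn.Sequential(
            nn.Linear(n_tabular_features, 16),
            nn.ReLU(),
        )
        # The head sees both feature vectors concatenated (8 + 16)
        self.head = nn.Linear(8 + 16, n_outputs)

    def forward(self, image, tabular):
        image_features = self.image_branch(image)
        tabular_features = self.tabular_branch(tabular)
        combined = torch.cat([image_features, tabular_features], dim=1)
        return self.head(combined)

model = ImageTabularNet(n_tabular_features=5, n_outputs=1)
images = torch.rand(4, 3, 32, 32)   # made-up batch of images
tabular = torch.rand(4, 5)          # made-up batch of tabular rows
predictions = model(images, tabular)
print(predictions.shape)
```

In a setup like this, the dataset class would return both the image tensor and the tabular tensor for each sample, so the dataloader can feed them to forward together.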


In this blog, I am going to show in practice how we can train or freeze different layers of a neural network model according to our choice. This type of problem arises when we add extra layers to a pre-trained model. So here I am going to demonstrate an example where I add extra layers to the EfficientNet_b0 model.

I have decided to add these extra layers: a dense layer, then a batch normalisation layer, then a dropout layer, and finally two dense layers. So, here is the model. …
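As a rough sketch of the idea, the example below freezes a backbone and trains only the extra layers described above. To stay self-contained it uses a tiny stand-in backbone instead of the real pre-trained EfficientNet_b0, and the class name ExtendedModel and all layer sizes are my assumptions.

```python
import torch
import torch.nn as nn

class ExtendedModel(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        # Stand-in for the pre-trained backbone (assumption, not EfficientNet)
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Extra layers: dense -> batch norm -> dropout -> two dense layers
        self.extra = nn.Sequential(
            nn.Linear(8, 32),
            nn.BatchNorm1d(32),
            nn.Dropout(0.2),
            nn.Linear(32, 16),
            nn.Linear(16, n_classes),
        )

    def forward(self, x):
        return self.extra(self.backbone(x))

model = ExtendedModel()

# Freeze the backbone so that only the extra layers are trained
for param in model.backbone.parameters():
    param.requires_grad = False

# Names of the parameters that will still receive gradient updates
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)

# Give the optimizer only the trainable parameters
optimizer = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)
```

To unfreeze a layer later, setting requires_grad back to True on its parameters and rebuilding the optimizer is usually enough.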


Most of us have used Google Colab with a GPU, but how many of us have used a TPU (Tensor Processing Unit) in Colab?


Most of us find it very difficult to add additional layers and connect them to an existing model. But here I am going to make it simple, so that everyone can benefit from it. Just read this through once and we will be good to go.

So, here I am going to use the architecture of two small models (EfficientNet_b0 & ResNet18) as our example to understand the topic.


I am going to share some tips and tricks with which we can increase the accuracy of our CNN models in deep learning.

These are the ways by which we can do it:

  1. Use of Pre-trained Model → First and foremost, we should use pre-trained model weights, as they generalize well from having been trained to recognize a large number of images. The learned weights will therefore help to classify the small number of classes in our dataset. …

Soumo Chatterjee

Machine learning and Deep Learning Enthusiast | | Mindtree Mind | | Python Lover
