YOLO Classification with Custom Dataset

Manivannan Murugavel
7 min read · Oct 27, 2022

This blog shows how to train a classification model on a custom dataset using YOLO Darknet.

My dataset is a two-class classification set of cats and dogs, with 400 images in total.

Step 1: Preparing dataset

The cat and dog images initially have arbitrary names. Rename each image to the pattern <image_name>_<class_name>. For example, if you have a cat image named 128362.jpg, rename it to 128362_cat.jpg. Do the same for the dog images.
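This renaming step can be sketched in Python; the demo below runs on a throwaway folder, and the folder paths in the comments are assumptions about your dataset layout:

```python
import os
import tempfile

def rename_with_class(folder, class_name):
    """Append the class name to every .jpg in `folder`,
    e.g. 128362.jpg -> 128362_cat.jpg."""
    for fname in os.listdir(folder):
        stem, ext = os.path.splitext(fname)
        if ext.lower() != ".jpg" or stem.endswith("_" + class_name):
            continue  # not an image, or already renamed
        os.rename(os.path.join(folder, fname),
                  os.path.join(folder, f"{stem}_{class_name}{ext}"))

# Demo on a temporary folder -- point this at your real image
# folders instead (e.g. Images/001 for cats, Images/002 for dogs).
demo = tempfile.mkdtemp()
open(os.path.join(demo, "128362.jpg"), "w").close()
rename_with_class(demo, "cat")
```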

cat images
dog images

The images above show samples of the cat and dog classes; prepare your own classification dataset the same way. The important thing is that each filename ends with the class name of the image.

Once the dataset is ready, create a train.txt file that lists all the image file paths.

=================> train.txt
/home/administrator/dataset/Image-Dataset/Images/001/1111_cat.jpg
/home/administrator/dataset/Image-Dataset/Images/001/1114_cat.jpg
/home/administrator/dataset/Image-Dataset/Images/001/1194_cat.jpg
/home/administrator/dataset/Image-Dataset/Images/001/1081_cat.jpg
/home/administrator/dataset/Image-Dataset/Images/001/1103_cat.jpg
/home/administrator/dataset/Image-Dataset/Images/001/1178_cat.jpg
/home/administrator/dataset/Image-Dataset/Images/001/1102_cat.jpg
/home/administrator/dataset/Image-Dataset/Images/001/1054_cat.jpg
.....
.....
/home/administrator/dataset/Image-Dataset/Images/002/2113_dog.jpg
/home/administrator/dataset/Image-Dataset/Images/002/2059_dog.jpg
/home/administrator/dataset/Image-Dataset/Images/002/2064_dog.jpg
/home/administrator/dataset/Image-Dataset/Images/002/2010_dog.jpg
/home/administrator/dataset/Image-Dataset/Images/002/2037_dog.jpg
/home/administrator/dataset/Image-Dataset/Images/002/2189_dog.jpg
/home/administrator/dataset/Image-Dataset/Images/002/2134_dog.jpg
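A train.txt like the one above can be generated with a short Python sketch; the demo folder below is temporary, and in practice you would pass your real class folders (e.g. Images/001, Images/002):

```python
import os
import tempfile

def write_image_list(folders, list_path):
    """Write the absolute path of every .jpg under `folders`
    to `list_path`, one path per line, as Darknet expects."""
    with open(list_path, "w") as f:
        for folder in folders:
            for fname in sorted(os.listdir(folder)):
                if fname.lower().endswith(".jpg"):
                    f.write(os.path.abspath(os.path.join(folder, fname)) + "\n")

# Demo on a temporary folder with two placeholder images.
demo = tempfile.mkdtemp()
for name in ("1111_cat.jpg", "2113_dog.jpg"):
    open(os.path.join(demo, name), "w").close()
list_file = os.path.join(demo, "train.txt")
write_image_list([demo], list_file)
```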

Step 2: Preparing the Configuration files

Create the .cfg, .data and .labels files. The format of each file is shown below.

.cfg

First, choose an architecture .cfg file; there are several files, one per network architecture: darknet.cfg, darknet19.cfg, darknet19_448.cfg, darknet53.cfg, darknet53_448_xnor.cfg, densenet201.cfg, resnet50.cfg, and so on. All of the .cfg files are available here.

I chose darknet19.cfg, and there is no need to add filters or a class count to the cfg file. If needed, you can change the batch, subdivisions, width, height, learning rate, and so on. I changed only the batch and subdivisions values in my file.
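For illustration, the kind of [net] values one might adjust at the top of darknet19.cfg looks like this; the numbers below are examples for this sketch, not the repository defaults:

```
[net]
# images per iteration; subdivisions splits each batch to fit GPU memory
batch=64
subdivisions=8
# network input resolution
width=224
height=224
```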

.labels

Create the .labels file containing the class names, one per line.

=============> classify.labels
cat
dog

.data

Create the .data file and set all of the paths. An example is shown below.

=============> classify.data
classes=2
train = /home/administrator/dataset/Image-Dataset/Images/train.txt
valid = /home/administrator/dataset/Image-Dataset/Images/valid.txt
backup = backup
labels = classify/classify.labels
  1. classes is the number of classes to train.
  2. train is the path to the train.txt file.
  3. valid is the path to the validation list, used to evaluate the model once it is trained.
  4. backup is the folder where weights files are stored every hundred iterations, as well as every 1,000 iterations.
  5. labels is the path to the file that stores the class names.
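Since the .data file also references a valid.txt, here is a hedged sketch of splitting the image paths into train and validation lists; the 400 dummy paths stand in for the real cat/dog images:

```python
import random

def split_paths(paths, valid_fraction=0.2, seed=0):
    """Shuffle and split image paths into (train, valid) lists."""
    paths = list(paths)
    random.Random(seed).shuffle(paths)  # seeded for reproducibility
    n_valid = int(len(paths) * valid_fraction)
    return paths[n_valid:], paths[:n_valid]

# Dummy paths standing in for the real dataset.
all_paths = [f"/data/img_{i:03d}.jpg" for i in range(400)]
train, valid = split_paths(all_paths)
# then write "\n".join(train) to train.txt and "\n".join(valid) to valid.txt
```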

Step 3: Start the training

I am using Colab to train the model, because I don't have a local GPU.

Go to the darknet folder and run:

./darknet classifier train classify/classify.data classify/darknet19.cfg

Training log:

58672, 4670.408: 0.017468, 0.133020 avg, 0.086119 rate, 0.526959 seconds, 1877504 images, 149.905127 hours left
Loaded: 0.000048 seconds
58673, 4670.487: 0.092421, 0.128960 avg, 0.086119 rate, 0.500856 seconds, 1877536 images, 149.922960 hours left
Loaded: 0.000053 seconds
58674, 4670.567: 0.241883, 0.140252 avg, 0.086119 rate, 0.509569 seconds, 1877568 images, 149.924490 hours left
Loaded: 0.000041 seconds
58675, 4670.646: 0.328562, 0.159083 avg, 0.086119 rate, 0.526248 seconds, 1877600 images, 150.059135 hours left
Loaded: 0.000061 seconds
58676, 4670.726: 0.159558, 0.159131 avg, 0.086118 rate, 0.510900 seconds, 1877632 images, 149.994646 hours left
Loaded: 0.000063 seconds
58677, 4670.806: 0.139034, 0.157121 avg, 0.086118 rate, 0.514418 seconds, 1877664 images, 150.058497 hours left
Loaded: 0.000036 seconds
58678, 4670.885: 0.096963, 0.151105 avg, 0.086118 rate, 0.509097 seconds, 1877696 images, 150.018027 hours left
Loaded: 0.000061 seconds
58679, 4670.965: 0.202908, 0.156286 avg, 0.086118 rate, 0.524636 seconds, 1877728 images, 150.044602 hours left
Loaded: 0.000042 seconds
58680, 4671.044: 0.107447, 0.151402 avg, 0.086117 rate, 0.507753 seconds, 1877760 images, 150.057304 hours left
Loaded: 0.000039 seconds
58681, 4671.125: 0.170147, 0.153276 avg, 0.086117 rate, 0.528043 seconds, 1877792 images, 150.065846 hours left
Loaded: 0.000050 seconds
58682, 4671.204: 0.058880, 0.143837 avg, 0.086117 rate, 0.525056 seconds, 1877824 images, 150.014908 hours left
Loaded: 0.000056 seconds
58683, 4671.284: 0.118079, 0.141261 avg, 0.086117 rate, 0.509327 seconds, 1877856 images, 150.041185 hours left
Loaded: 0.000047 seconds
58684, 4671.363: 0.209800, 0.148115 avg, 0.086117 rate, 0.509897 seconds, 1877888 images, 149.995527 hours left
Loaded: 0.000045 seconds
58685, 4671.443: 0.034625, 0.136766 avg, 0.086116 rate, 0.486971 seconds, 1877920 images, 149.968449 hours left
Loaded: 0.000038 seconds
58686, 4671.522: 0.170590, 0.140148 avg, 0.086116 rate, 0.515038 seconds, 1877952 images, 149.988260 hours left
Loaded: 0.000042 seconds
58687, 4671.602: 0.116686, 0.137802 avg, 0.086116 rate, 0.526876 seconds, 1877984 images, 150.062340 hours left
Loaded: 0.000050 seconds
58688, 4671.682: 0.161948, 0.140217 avg, 0.086116 rate, 0.515048 seconds, 1878016 images, 149.991134 hours left
Loaded: 0.000073 seconds
58689, 4671.761: 0.174787, 0.143674 avg, 0.086115 rate, 0.504868 seconds, 1878048 images, 149.987172 hours left
Loaded: 0.000063 seconds
58690, 4671.841: 0.122182, 0.141524 avg, 0.086115 rate, 0.510415 seconds, 1878080 images, 150.003298 hours left
Loaded: 0.000050 seconds
58691, 4671.920: 0.165099, 0.143882 avg, 0.086115 rate, 0.529815 seconds, 1878112 images, 150.006565 hours left
Loaded: 0.000047 seconds
58692, 4672.000: 0.149811, 0.144475 avg, 0.086115 rate, 0.517317 seconds, 1878144 images, 150.008731 hours left
Loaded: 0.000045 seconds
58693, 4672.080: 0.033141, 0.133341 avg, 0.086115 rate, 0.517650 seconds, 1878176 images, 149.983763 hours left
Loaded: 0.000060 seconds
58694, 4672.159: 0.119514, 0.131959 avg, 0.086114 rate, 0.495346 seconds, 1878208 images, 149.927714 hours left
Loaded: 0.000048 seconds
58695, 4672.239: 0.075158, 0.126279 avg, 0.086114 rate, 0.525910 seconds, 1878240 images, 149.945994 hours left
Loaded: 0.000054 seconds
58696, 4672.318: 0.045376, 0.118188 avg, 0.086114 rate, 0.519505 seconds, 1878272 images, 149.898621 hours left
Loaded: 0.000054 seconds
58697, 4672.398: 0.051555, 0.111525 avg, 0.086114 rate, 0.529496 seconds, 1878304 images, 149.920542 hours left
Loaded: 0.000044 seconds
58698, 4672.478: 0.151574, 0.115530 avg, 0.086113 rate, 0.483182 seconds, 1878336 images, 149.864569 hours left
Loaded: 0.000056 seconds
58699, 4672.557: 0.111679, 0.115145 avg, 0.086113 rate, 0.528589 seconds, 1878368 images, 149.875756 hours left
Loaded: 0.000055 seconds
58700, 4672.637: 0.126829, 0.116313 avg, 0.086113 rate, 0.510915 seconds, 1878400 images, 149.883499 hours left
Saving weights to backup/darknet19_last.weights
Loaded: 0.000078 seconds
58701, 4672.716: 0.156828, 0.120365 avg, 0.086113 rate, 0.510552 seconds, 1878432 images, 150.746769 hours left
Loaded: 0.000050 seconds
58702, 4672.796: 0.241772, 0.132506 avg, 0.086113 rate, 0.535707 seconds, 1878464 images, 150.833295 hours left
Loaded: 0.000049 seconds
58703, 4672.875: 0.043809, 0.123636 avg, 0.086112 rate, 0.508785 seconds, 1878496 images, 150.865441 hours left
Loaded: 0.000079 seconds
58704, 4672.955: 0.108021, 0.122074 avg, 0.086112 rate, 0.504638 seconds, 1878528 images, 150.803537 hours left
Loaded: 0.000051 seconds

Step 4: Testing the model

./darknet classifier predict classify/classify.data classify/darknet19.cfg darknet19_46000.weights cat.jpg

Output:

CUDA-version: 11020 (11020), cuDNN: 8.1.1, GPU count: 1  
OpenCV isn't used - data augmentation will be slow
0 : compute_capability = 750, cudnn_half = 0, GPU: Tesla T4
net.optimized_memory = 0
mini_batch = 1, batch = 1, time_steps = 1, train = 0
layer filters size/strd(dil) input output
0 Create CUDA-stream - 0
Create cudnn-handle 0
conv 32 3 x 3/ 1 224 x 224 x 3 -> 224 x 224 x 32 0.087 BF
1 max 2x 2/ 2 224 x 224 x 32 -> 112 x 112 x 32 0.002 BF
2 conv 64 3 x 3/ 1 112 x 112 x 32 -> 112 x 112 x 64 0.462 BF
3 max 2x 2/ 2 112 x 112 x 64 -> 56 x 56 x 64 0.001 BF
4 conv 128 3 x 3/ 1 56 x 56 x 64 -> 56 x 56 x 128 0.462 BF
5 conv 64 1 x 1/ 1 56 x 56 x 128 -> 56 x 56 x 64 0.051 BF
6 conv 128 3 x 3/ 1 56 x 56 x 64 -> 56 x 56 x 128 0.462 BF
7 max 2x 2/ 2 56 x 56 x 128 -> 28 x 28 x 128 0.000 BF
8 conv 256 3 x 3/ 1 28 x 28 x 128 -> 28 x 28 x 256 0.462 BF
9 conv 128 1 x 1/ 1 28 x 28 x 256 -> 28 x 28 x 128 0.051 BF
10 conv 256 3 x 3/ 1 28 x 28 x 128 -> 28 x 28 x 256 0.462 BF
11 max 2x 2/ 2 28 x 28 x 256 -> 14 x 14 x 256 0.000 BF
12 conv 512 3 x 3/ 1 14 x 14 x 256 -> 14 x 14 x 512 0.462 BF
13 conv 256 1 x 1/ 1 14 x 14 x 512 -> 14 x 14 x 256 0.051 BF
14 conv 512 3 x 3/ 1 14 x 14 x 256 -> 14 x 14 x 512 0.462 BF
15 conv 256 1 x 1/ 1 14 x 14 x 512 -> 14 x 14 x 256 0.051 BF
16 conv 512 3 x 3/ 1 14 x 14 x 256 -> 14 x 14 x 512 0.462 BF
17 max 2x 2/ 2 14 x 14 x 512 -> 7 x 7 x 512 0.000 BF
18 conv 1024 3 x 3/ 1 7 x 7 x 512 -> 7 x 7 x1024 0.462 BF
19 conv 512 1 x 1/ 1 7 x 7 x1024 -> 7 x 7 x 512 0.051 BF
20 conv 1024 3 x 3/ 1 7 x 7 x 512 -> 7 x 7 x1024 0.462 BF
21 conv 512 1 x 1/ 1 7 x 7 x1024 -> 7 x 7 x 512 0.051 BF
22 conv 1024 3 x 3/ 1 7 x 7 x 512 -> 7 x 7 x1024 0.462 BF
23 conv 1000 1 x 1/ 1 7 x 7 x1024 -> 7 x 7 x1000 0.100 BF
24 avg 7 x 7 x1000 -> 1000
25 softmax 1000
26 cost 1000
Total BFLOPS 5.585
avg_outputs = 213742
Allocate additional workspace_size = 52.44 MB
Loading weights from darknet19_46000.weights...
seen 32, trained: 204800 K-images (3200 Kilo-batches_64)
Done! Loaded 27 layers from weights-file

try to allocate additional workspace_size = 52.44 MB
CUDA allocate done!
classes = 1000, output in cfg = 0
224 224
cat.jpg: Predicted in 6.341000 milli-seconds.
cat: 0.997701
dog: 0.075252
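If you run the classifier from a script, the `label: score` lines at the end of its output can be parsed with a small helper like this (a sketch; the sample string is just the output shown above):

```python
def parse_predictions(output_text):
    """Extract `label: score` pairs from Darknet classifier output."""
    scores = {}
    for line in output_text.splitlines():
        label, sep, value = line.partition(": ")
        if sep:
            try:
                scores[label.strip()] = float(value)
            except ValueError:
                pass  # skips e.g. the "Predicted in ... milli-seconds." line
    return scores

sample = ("cat.jpg: Predicted in 6.341000 milli-seconds.\n"
          "cat: 0.997701\n"
          "dog: 0.075252\n")
scores = parse_predictions(sample)
top = max(scores, key=scores.get)
```

Note that the two scores need not sum to 1 here; the log above reports `classes = 1000, output in cfg = 0`, so the softmax still runs over the cfg's 1000 outputs.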

All the files are here.
