How to train YOLOv2 to detect custom objects

OS: Ubuntu 16.04
Python Version: 2.7
GPU: 1 GB

Installing Darknet

ubuntu:~$ git clone
ubuntu:~$ cd darknet
ubuntu:~$ vi Makefile
ubuntu:~$ make
usage: ./darknet <function>

Yolo Testing

You already have the config file for YOLO in the cfg/ subdirectory. You will have to download the pre-trained weight file here (258 MB). Then run the detector on a sample image:

./darknet detect cfg/yolo.cfg yolo.weights data/dog.jpg
layer     filters    size              input                output
0 conv 32 3 x 3 / 1 416 x 416 x 3 -> 416 x 416 x 32
1 max 2 x 2 / 2 416 x 416 x 32 -> 208 x 208 x 32
29 conv 425 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 425
30 detection
Loading weights from yolo.weights...Done!
data/dog.jpg: Predicted in 0.016287 seconds.
car: 54%
bicycle: 51%
dog: 56%
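If you want to use these results in a script, the label lines of the console output can be parsed. The following is a minimal sketch, assuming the output keeps the "label: NN%" shape shown above; the function name parse_detections and the regular expression are my own illustration and are not part of darknet.

```python
import re

# Matches result lines such as "car: 54%"; the timing line
# "data/dog.jpg: Predicted in ..." does not match because no
# number followed by "%" comes right after the colon.
DETECTION_RE = re.compile(r"^\s*(?P<label>[^:]+):\s*(?P<score>\d+)%\s*$")


def parse_detections(output_text):
    """Return a list of (label, confidence) pairs, confidence in 0.0..1.0."""
    detections = []
    for line in output_text.splitlines():
        match = DETECTION_RE.match(line)
        if match:
            label = match.group("label").strip()
            confidence = int(match.group("score")) / 100.0
            detections.append((label, confidence))
    return detections


# Sample text copied from the console output above.
sample = """Loading weights from yolo.weights...Done!
data/dog.jpg: Predicted in 0.016287 seconds.
car: 54%
bicycle: 51%
dog: 56%"""

for label, confidence in parse_detections(sample):
    print("%s %.2f" % (label, confidence))
```

This only reads the class names and scores that darknet prints; it does not recover box coordinates.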

Data annotation

The main task is to create a text file for each image, as follows:

  1. Create a .txt file for each .jpg image file, in the same directory and with the same name but with the .txt extension. In it, put the object class and object coordinates for each object in the image, one object per line: <object-class> <x> <y> <width> <height>
  • <object-class> - integer class index of the object, from 0 to (classes-1)
  • <x> <y> <width> <height> - float values relative to the width and height of the image, ranging from 0.0 to 1.0
  • for example: <x> = <absolute_x> / <image_width> or <height> = <absolute_height> / <image_height>
  • attention: <x> <y> are the center of the rectangle (not the top-left corner)
For example, a .txt file describing three objects:
1 0.716797 0.395833 0.216406 0.147222
0 0.687109 0.379167 0.255469 0.158333
1 0.420312 0.395833 0.140625 0.166667
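A malformed annotation line is easy to miss, so it can help to check the files before training. Here is a minimal sketch of such a check, based only on the rules listed above (five fields, integer class from 0 to classes-1, floats from 0.0 to 1.0). The function name validate_label_line is hypothetical.

```python
def validate_label_line(line, num_classes):
    """Return a list of problems found in one annotation line (empty if OK)."""
    fields = line.split()
    if len(fields) != 5:
        return ["expected 5 fields, found %d" % len(fields)]

    problems = []

    # First field: integer class index in 0..num_classes-1.
    try:
        class_id = int(fields[0])
        if not 0 <= class_id < num_classes:
            problems.append("class %d outside 0..%d" % (class_id, num_classes - 1))
    except ValueError:
        problems.append("class is not an integer: %r" % fields[0])

    # Remaining fields: floats relative to the image size, 0.0..1.0.
    for name, text in zip(("x", "y", "width", "height"), fields[1:]):
        try:
            value = float(text)
        except ValueError:
            problems.append("%s is not a number: %r" % (name, text))
            continue
        if not 0.0 <= value <= 1.0:
            problems.append("%s=%s outside 0.0..1.0" % (name, text))

    return problems


# The first sample line above, checked against a two-class dataset.
print(validate_label_line("1 0.716797 0.395833 0.216406 0.147222", 2))
```

An empty list means the line passed all checks.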
My image size is 360 x 480 and it has one object, i.e. a dog:
image_width = 360
image_height = 480
absolute_x = 30 (x position of the dog in the image)
absolute_y = 40 (y position of the dog in the image)
absolute_height = 200 (height of the dog in the image, in pixels)
absolute_width = 200 (width of the dog in the image, in pixels)
<class_number> (<absolute_x> / <image_width>) (<absolute_y> / <image_height>) (<absolute_width> / <image_width>) (<absolute_height> / <image_height>)
0 (30/360) (40/480) (200/360) (200/480)
0 0.0833 0.0833 0.556 0.417
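The arithmetic above can be sketched in Python. This is an illustration, not the annotation tool's own code. The function to_yolo_line assumes the x and y values it receives are already the center of the box, as the format requires. If your tool reports the top-left corner instead, the helper corner_to_center shifts it to the center first. Both function names are hypothetical.

```python
def corner_to_center(left, top, box_w, box_h):
    """Convert a top-left corner (in pixels) to the box center (in pixels)."""
    return left + box_w / 2.0, top + box_h / 2.0


def to_yolo_line(class_id, center_x, center_y, box_w, box_h, img_w, img_h):
    """Build one '<object-class> <x> <y> <width> <height>' annotation line.

    All box values are in pixels; the result is relative to the image size.
    """
    return "%d %.6f %.6f %.6f %.6f" % (
        class_id,
        center_x / float(img_w),
        center_y / float(img_h),
        box_w / float(img_w),
        box_h / float(img_h),
    )


# Same numbers as the dog example above.
print(to_yolo_line(0, 30, 40, 200, 200, 360, 480))
# prints: 0 0.083333 0.083333 0.555556 0.416667
```

If 30 and 40 were the top-left corner of a 200 x 200 box, corner_to_center(30, 40, 200, 200) would give a center of (130.0, 140.0), and those are the values to pass to to_yolo_line.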

YOLO Training Started

Note: Please train first on data with a single object. Once that succeeds, move on to your own custom data (with multiple objects).
I have released a new annotation tool for data preparation; please find the link to the new tool. If you used the new tool, skip the two Python scripts in this blog and move directly to the run step, because the new tool does the work of both scripts in a single Python script. Please watch the full description video on the new tool's blog post to see how to run it.

import glob
import os

# Directory of this script (printed for reference)
current_dir = os.path.dirname(os.path.abspath(__file__))
print(current_dir)

# Directory that holds the dataset images
current_dir = '<Your Dataset Path>'

# Percentage of images to be used for the test set
percentage_test = 10

# Create and/or truncate train.txt and test.txt
file_train = open('train.txt', 'w')
file_test = open('test.txt', 'w')

# Populate train.txt and test.txt: every index_test-th image goes to the test set
counter = 1
index_test = round(100 / percentage_test)
for pathAndFilename in glob.iglob(os.path.join(current_dir, "*.jpg")):
    title, ext = os.path.splitext(os.path.basename(pathAndFilename))
    if counter == index_test:
        counter = 1
        file_test.write(current_dir + "/" + title + '.jpg' + "\n")
    else:
        file_train.write(current_dir + "/" + title + '.jpg' + "\n")
        counter = counter + 1

file_train.close()
file_test.close()
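Since every .jpg needs a .txt annotation file with the same name in the same directory, it may be worth confirming that before training. Below is a minimal sketch, assuming train.txt and test.txt contain one image path per line, as written by the script above. The function name missing_labels is hypothetical.

```python
import os


def missing_labels(list_file):
    """Return the image paths in list_file that have no matching .txt file."""
    missing = []
    with open(list_file) as handle:
        for line in handle:
            image_path = line.strip()
            if not image_path:
                continue  # skip blank lines
            # The annotation file has the same name with a .txt extension.
            label_path = os.path.splitext(image_path)[0] + ".txt"
            if not os.path.isfile(label_path):
                missing.append(image_path)
    return missing


# Check whichever list files exist in the current directory.
for list_name in ("train.txt", "test.txt"):
    if os.path.isfile(list_name):
        for image_path in missing_labels(list_name):
            print("no annotation file for " + image_path)
```

Run it from the directory that contains train.txt and test.txt; no output means every listed image has an annotation file.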

Preparing YOLOv2 configuration files

YOLOv2 needs certain specific files to know how and what to train. We'll be creating these three files. I am using a 1 GB GPU, so I used tiny-yolo.cfg:

  • cfg/
  • cfg/obj.names
  • cfg/tiny-yolo.cfg
The first of these files, the data file, contains:
classes = 1
train = train.txt
valid = test.txt
names = obj.names
backup = backup/
In cfg/tiny-yolo.cfg, edit the following lines:
  • Line 2: set batch=24, which means we will be using 24 images for every training step
  • Line 3: set subdivisions=8, so the batch is divided by 8 to decrease GPU VRAM requirements. If you have a powerful GPU with loads of VRAM, this number can be decreased, or batch can be increased. If the training step throws a CUDA out of memory error, adjust these values accordingly.
  • Line 120: set classes=1, the number of categories we want to detect
  • Line 114: set filters=(classes + 5)*5, in our case filters=30
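The two numbers that are easiest to get wrong can be computed rather than typed by hand. This is a small sketch of the arithmetic in the list above. It assumes the multiplier 5 in the filters formula is the number of anchor boxes and that batch is a multiple of subdivisions. The function names are hypothetical.

```python
def region_filters(num_classes, num_anchors=5):
    """filters for the last convolutional layer: (classes + 5) * anchors."""
    return (num_classes + 5) * num_anchors


def images_per_pass(batch, subdivisions):
    """Number of images sent to the GPU at once: batch / subdivisions."""
    if batch % subdivisions != 0:
        raise ValueError("batch should be a multiple of subdivisions")
    return batch // subdivisions


print(region_filters(1))       # 30, the value set on line 114
print(images_per_pass(24, 8))  # 3 images on the GPU at a time
print(region_filters(80))      # 425, matching layer 29 of the 80-class model above
```

If you later train with more classes, recompute filters with the new class count and update both the classes and filters lines.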


Time for the fun part! Copy your train.txt and test.txt into the yolo_darknet folder, then enter the following command into your terminal and watch your GPU do what it does best:

manivannan@manivannan-whirldatascience:~/YoloExample/darknet$ ./darknet detector train cfg/ cfg/yolo-obj.cfg darknet19_448.conv.23


We should now have a .weights file that represents our trained model. Let's use this on some images to see how well it can detect the NFPA 704 'fire diamond' pictogram. This command unleashes YOLOv2 on an image of our choosing. If you want my trained weights file, you can try downloading it here.

manivannan@manivannan-whirldatascience:~/YoloExample/darknet$ ./darknet detector test cfg/ cfg/yolo-obj.cfg yolo-obj1000.weights data/manivannan.jpg


