How to Train NER with Custom training data using spaCy.

Manivannan Murugavel
3 min read · Mar 29, 2019

This blog explains how to train a named entity recognition (NER) model on my own training data, and get entities from it, using spaCy and Python.

An earlier blog explains what spaCy is and how to do named entity recognition with it. Now I want to train on my own data so that the model identifies the entities in my text.

Previously, I did not use any annotation tool to annotate entities in text. I have since created a tool called spaCy NER Annotator. The main reason for making this tool is to reduce annotation time, and it has helped a lot with annotating NER data. I have a simple dataset of 20 lines to train on. It is based on product names from an e-commerce site.

Step:1

Step 1 shows how to use the NER annotation tool.
The demo video is shown below.

The full source code is available at this link. You can download it and run it.

Step:2

This step converts the annotations into the spaCy format. The spaCy training format is a list of tuples, but JavaScript does not have a tuple data type, so I use a Python script called convert_spacy_train_data.py to produce the final training format. This step is already covered in the video above, so skip it if you have already done it.

Example:

python convert_spacy_train_data.py

OUTPUT:

It also saves the output to a text file (train.txt).
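As a rough illustration of the conversion described in this step, here is a minimal sketch. The JSON layout (`text` and `entities` keys) and the function name `to_spacy_format` are assumptions made for the example and may not match what the annotator or convert_spacy_train_data.py actually use. The point is only that JSON gives lists, and the training format wants tuples.

```python
import json

# Hypothetical annotator export. JSON has no tuple type, so each
# entity span arrives as a list: [start, end, label].
RAW_JSON = """
[
  {"text": "what is the price of pen?",
   "entities": [[21, 24, "PrdName"]]}
]
"""


def to_spacy_format(records):
    """Build a list of (text, {"entities": [(start, end, label), ...]}) tuples."""
    train_data = []
    for record in records:
        # Convert every [start, end, label] list into a tuple.
        entities = [
            (int(start), int(end), label)
            for start, end, label in record["entities"]
        ]
        train_data.append((record["text"], {"entities": entities}))
    return train_data


TRAIN_DATA = to_spacy_format(json.loads(RAW_JSON))
print(TRAIN_DATA)
```

Each item pairs one sentence with the character offsets and label of every entity in it.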

Step:3

You can start the training once you have completed the second step. Just copy the text and paste it into the TRAIN_DATA variable in train.py, then run the training:

$ python train.py
Statring iteration 0
{'ner': 45.187162002439436}
Statring iteration 1
{'ner': 3.332452492465983}
Statring iteration 2
{'ner': 2.0020173944797612}
Statring iteration 3
{'ner': 1.506792157187176}
Statring iteration 4
{'ner': 1.435887415739853}
Statring iteration 5
{'ner': 2.6435020729886842}
Statring iteration 6
{'ner': 6.12104525022155}
Statring iteration 7
{'ner': 2.473104735366477}
Statring iteration 8
{'ner': 1.735813616271975}
Statring iteration 9
{'ner': 0.5837462286798504}
Statring iteration 10
{'ner': 0.48013027154284454}
Statring iteration 11
{'ner': 0.010529610710777635}
Statring iteration 12
{'ner': 0.0003886471581916718}
Statring iteration 13
{'ner': 4.343940463793621e-05}
Statring iteration 14
{'ner': 0.0004672196252375103}
Statring iteration 15
{'ner': 8.930835107521379e-05}
Statring iteration 16
{'ner': 1.272837546950617e-06}
Statring iteration 17
{'ner': 0.0003499371851350634}
Statring iteration 18
{'ner': 5.754545232317604e-07}
Statring iteration 19
{'ner': 1.3910001552992408e-07}
Enter your Model Name: myMdl
Enter your testing text: what is the price of pen?
pen 21 24 PrdName
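The last line of the run lists the entity text, its start offset, its end offset, and its label. The sketch below assumes the offsets are character positions with an exclusive end, which is how spaCy reports entity spans. The helper `check_span` is a hypothetical name used only for this illustration. The same check is useful on training data, since a wrong offset there labels the wrong characters.

```python
def check_span(text, start, end, expected):
    """Return True when text[start:end] is exactly the expected entity text."""
    return text[start:end] == expected


text = "what is the price of pen?"

# Recompute the offsets by hand: start is the index of the first
# character, end is one past the last character.
start = text.find("pen")
end = start + len("pen")

print(start, end)                        # 21 24
print(check_span(text, 21, 24, "pen"))   # True
```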

Enter a model name to save the model under, then enter some text to run a prediction on. Once the trained model is saved, you can load it using

>>> import spacy
>>> nlp = spacy.load('model name')

The full source code is available on GitHub.
This is the web URL (if you do not need GitHub).
