Level Up Development

For the past few months, the Level Up Development team has put a large research effort into the computer vision space. Computer vision plays a critical role for organizations as they try to understand their audiences better, automate more, and see whether marketing campaigns are truly working.

3 Ways to Train a Machine to Detect Objects

Along the way we researched a few of the most successful training techniques and thought it’d be helpful to share our insights. Specifically, we thought a simple list of approaches to training object localization models, matched up with their pros and cons, could help you understand which technique may be best for your project.

1) Training model from scratch

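Design your own convolutional neural network architecture and, starting from untrained weights (typically a random initialization rather than a matrix of identical values, since identical weights would all receive the same updates), train the model on manually created data sets.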

Pros

More control over architecture

Cons

Must design and test the model architecture (a big task; we have done this in the past)
Must manually label training data
Must train the model from scratch, which takes much longer than transfer learning and needs much more training data
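
To give a feel for what "designing your own architecture" is built from, here is a minimal sketch of a single convolution layer's forward pass in plain Python. The function names, the 4x4 image, and the hand-picked kernel are illustrative assumptions, not code from our projects. When training from scratch, the kernel values would be learned from labeled data rather than written by hand.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` with stride 1 and no padding.

    Like most deep learning libraries, this computes a cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map: Matrix) -> Matrix:
    """Zero out negative activations."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Toy 4x4 image: dark left half, bright right half.
image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

# Hand-picked vertical-edge kernel (assumption for illustration only).
kernel = [
    [-1.0, 1.0],
    [-1.0, 1.0],
]

feature_map = relu(conv2d_valid(image, kernel))
```

The resulting 3x3 feature map is non-zero only in the middle column, where the dark-to-bright edge sits. A full architecture stacks many such layers, which is part of why designing and training one from scratch is a big task.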

2) Using pre-trained convolutional neural nets

Examples: faster_rcnn_nas, ssd_inception_v2_coco, ssdlite_mobilenet_v2_coco

Pros

Ready to go, with no training done by the user. The weight matrix is frozen in a pre-trained state by the team who built the model. The speed and accuracy of each model are known, so the appropriate model for the task can be selected.
Parameters can be adjusted using config files
Easy access to the latest vision models, which give much better results: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
Swapping between models for comparison is fast and easy
Easy to switch between local and cloud computing (TPU training is available for some models)

Cons

Only recognizes the classes it was pre-trained on. The most common data set for pre-trained models is COCO (Common Objects in Context) http://cocodataset.org/#home. The model zoo configs use 90 class IDs for COCO, of which 80 are annotated object categories.
Will not recognize new objects without transfer learning.
May be suboptimal for tasks that only care about a few object classes: detections of unwanted classes add noise, and the model spends time looking for objects you do not need.
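
One common workaround for the noise problem is to post-filter a pre-trained model's output down to the classes you care about. The sketch below assumes detections have already been decoded into label, score, and box; the `Detection` type, the `keep_relevant` helper, and the sample values are hypothetical and made up for illustration. Note that this only hides unwanted detections. The model still spends time computing them.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Detection:
    """One decoded detection (hypothetical structure for this sketch)."""
    label: str
    score: float
    # Assumed box layout: (ymin, xmin, ymax, xmax), normalized to [0, 1].
    box: Tuple[float, float, float, float]


def keep_relevant(
    detections: Iterable[Detection],
    wanted: Set[str],
    min_score: float = 0.5,
) -> List[Detection]:
    """Keep detections whose class is wanted and whose score is high enough."""
    return [
        d for d in detections
        if d.label in wanted and d.score >= min_score
    ]


# Made-up model output for a single image.
detections = [
    Detection("person", 0.92, (0.10, 0.20, 0.80, 0.45)),
    Detection("dog", 0.88, (0.55, 0.50, 0.95, 0.80)),
    Detection("car", 0.35, (0.05, 0.60, 0.30, 0.95)),
    Detection("car", 0.71, (0.40, 0.05, 0.70, 0.35)),
]

kept = keep_relevant(detections, wanted={"person", "car"}, min_score=0.5)
```

Here the dog is dropped because its class is not wanted, and the low-confidence car is dropped by the score threshold.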

3) Transfer learning

Takes a pre-trained model (with its current weight matrix values) and continues training from there. Depending on the type of model, the label data will be more or less complicated to produce.
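
The idea can be sketched with a toy example in plain Python: a fixed "backbone" stands in for the pre-trained layers, and only a small final layer is trained on new labels. Everything here (the backbone weights, the four data points, the learning rate) is an assumption chosen for illustration. A real project would load an actual pre-trained network instead.

```python
import math
from typing import List, Sequence, Tuple

# Stand-in for pre-trained layers: fixed weights that are never updated.
BACKBONE_WEIGHTS = ((1.0, 1.0), (1.0, -1.0))


def backbone(x: Sequence[float]) -> List[float]:
    """Frozen feature extractor (hypothetical pre-trained layers)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in BACKBONE_WEIGHTS]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_head(
    samples: Sequence[Sequence[float]],
    labels: Sequence[int],
    lr: float = 0.5,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Fit only the final layer (logistic regression) on frozen features."""
    features = [backbone(x) for x in samples]  # computed once; backbone is fixed
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for f, y in zip(features, labels):
            error = sigmoid(sum(w * fi for w, fi in zip(weights, f)) + bias) - y
            for k, fi in enumerate(f):
                grad_w[k] += error * fi / n
            grad_b += error / n
        # Gradient descent step on the head only.
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def predict(x: Sequence[float], weights: Sequence[float], bias: float) -> int:
    score = sum(w * fi for w, fi in zip(weights, backbone(x))) + bias
    return 1 if score > 0 else 0


# Tiny made-up data set for the "new" task.
samples = [(1.0, 1.0), (2.0, 0.5), (-1.0, -1.0), (-0.5, -2.0)]
labels = [1, 1, 0, 0]

head_weights, head_bias = train_head(samples, labels)
predictions = [predict(x, head_weights, head_bias) for x in samples]
```

Because the backbone output is computed once and never changes, each training step only touches the handful of head parameters. That is the toy version of why transfer learning needs less data and time than training every layer.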

Pros

All the benefits of pre-trained models, plus you can re-train the final layers of the model to look only for the objects you are interested in.
Needs much less data and is much faster than training from scratch

Cons

Must manually create training data. Depending on the model to be used (localization, attributes, segmentation, etc.), this may be more complicated.
