Classification in OpenCV Using Pre-built Caffe Model

Based upon:

  1. https://becominghuman.ai/face-detection-with-opencv-and-deep-learning-90b84735f421

With additional input from:

2. https://www.pyimagesearch.com/2017/08/21/deep-learning-with-opencv/

3. https://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb

A) Download Caffe Models

You need the following three files:

  1. .prototxt file containing model definition
  2. .caffemodel file containing layer weights
  3. a synset file containing category labels

plus an image file to classify. Just steal one from the internet like everyone else does (or grab a free one from www.pexels.com).

Before going any further, create a project folder to contain Caffe model files and Python script. Also save the image file to the project folder.

Download .prototxt and .caffemodel files

Link to files:

https://github.com/BVLC/caffe/tree/master/models/bvlc_googlenet

This opens the bvlc_googlenet folder on GitHub; the readme shown on that page includes a caffemodel_url link.

Left-click (LMB) on caffemodel_url to download the .caffemodel file. Copy the file from the Downloads folder to the project folder.

Left-click on “deploy.prototxt”. This opens a page listing the contents of the file. Right-click (RMB) on [Raw] and select “Save link as . . .” to download the .prototxt file. Save the file in the project folder. I renamed it “bvlc_googlenet_deploy.prototxt” to identify it with the model.

Download synset file

The following link will download the tar file containing synset file:

http://dl.caffe.berkeleyvision.org/caffe_ilsvrc12.tar.gz

Move file from Downloads folder to project folder. Then extract it to a separate folder within project folder (tar file contains more than just synset file).
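If you would rather pull only the synset file out of the archive from Python, something like the sketch below should work. The helper name `extract_member` is my own invention for illustration, and I am assuming the archive contains a file called synset_words.txt (the name the script loads later).

```python
import tarfile
from pathlib import Path


def extract_member(archive_path, dest_dir, wanted="synset_words.txt"):
    """Copy the single file named `wanted` out of a .tar.gz into dest_dir.

    Hypothetical helper: returns the path of the written file, or raises
    FileNotFoundError if the archive has no member with that name.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            # compare on the base name so files in sub-folders also match
            if member.isfile() and Path(member.name).name == wanted:
                target = dest / wanted
                with tar.extractfile(member) as src:
                    target.write_bytes(src.read())
                return target
    raise FileNotFoundError("{} not found in {}".format(wanted, archive_path))
```

For example, `extract_member('caffe_ilsvrc12.tar.gz', '_labels')` would leave the labels at _labels/synset_words.txt, which is the path used further down.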

B) Python Code

Create a new Python script file in the project folder. Copy the code below into that file (or paste it into IPython or a similar console).

Import modules and print versions:

import numpy as np
import cv2 as cv
import sys # only used to print python version
print('Python: ',sys.version)
print('numPy: ',np.__version__)
print('OpenCV: ',cv.__version__)

Output:

Python: 3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)]
numPy: 1.16.4
OpenCV: 4.1.1

Define model filenames:

PROTOTXT = 'bvlc_googlenet_deploy.prototxt'
MODEL = 'bvlc_googlenet.caffemodel'
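Loading the model fails with a fairly cryptic OpenCV error if either file is missing or misnamed, so it can be worth checking first. This is only a sketch; `missing_files` is a hypothetical helper, not part of OpenCV.

```python
from pathlib import Path


def missing_files(folder, names):
    """Return the entries of `names` that are not regular files in `folder`."""
    base = Path(folder)
    return [name for name in names if not (base / name).is_file()]
```

Calling `missing_files('.', [PROTOTXT, MODEL])` from the project folder should return an empty list when both downloads are in place.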

Define confidence level:

CONFIDENCE = .4

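This defines the minimum probability that constitutes a successful classification. Note that the rest of the script never applies it: the top 5 predictions are printed regardless of their probability.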
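One way such a threshold could be applied is sketched here. The function `describe` and its wording are assumptions of mine for illustration; it takes a label and a probability between 0 and 1.

```python
CONFIDENCE = 0.4  # same value as above


def describe(label, prob, threshold=CONFIDENCE):
    """Format a prediction, flagging it when prob is below the threshold."""
    if prob >= threshold:
        return "{}, p={:.2f}%".format(label, prob * 100)
    return "no confident match (best guess: {}, p={:.2f}%)".format(
        label, prob * 100)
```

With a top probability of 0.51849 the label is reported as-is; with something like 0.023717 the text says there is no confident match instead.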

Load model:

net = cv.dnn.readNetFromCaffe(PROTOTXT, MODEL)

Load class label file and extract labels to array:

rows = open('_labels/synset_words.txt').read().strip().split("\n")
classes = [r[r.find(" ") + 1:].split(",")[0] for r in rows]
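To see what the list comprehension does, here it is run on a few sample rows. I am assuming each line of the synset file has the form "WordNet ID, space, comma-separated names"; the sample rows are for illustration only.

```python
# Sample rows in the assumed "ID name, alternative name, ..." layout
rows = [
    "n01440764 tench, Tinca tinca",
    "n01443537 goldfish, Carassius auratus",
    "n02099712 Labrador retriever",
]

# Drop everything up to the first space (the ID), then keep only the
# first of the comma-separated names
classes = [r[r.find(" ") + 1:].split(",")[0] for r in rows]
print(classes)  # ['tench', 'goldfish', 'Labrador retriever']
```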

Load image file:

img = cv.imread('example.jpg')
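cv.imread does not raise an error when the file cannot be read; it returns None, and the failure only shows up later. A small guard like the one below makes the problem obvious straight away (`require_image` is a hypothetical helper name).

```python
def require_image(img, path):
    """Return img unchanged, or raise if the loader handed back None."""
    if img is None:
        raise FileNotFoundError("could not read image: {}".format(path))
    return img
```

It could be used as `img = require_image(cv.imread('example.jpg'), 'example.jpg')`.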

Calculate image mean:

imgM = cv.mean(img)
print('image mean ',imgM)

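Output (mean-B, mean-G, mean-R, unused). OpenCV loads images in BGR order, and cv.mean always returns four values, so the last one is 0 for a 3-channel image: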

image mean (135.6633213531514, 153.53549636533464, 168.6211724333983, 0.0)

Generate 4D blob from image:

blob = cv.dnn.blobFromImage(img , 1.0, (224, 224), imgM)

where the parameters for this command are:

  • scale factor = 1.0
  • size of output image = (224, 224)
  • mean of image to be subtracted
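Roughly speaking, the blob is the image with the mean subtracted, multiplied by the scale factor, and rearranged from (height, width, channels) to (batch, channels, height, width). The NumPy sketch below mimics that under my own assumptions: it skips the resize to (224, 224) and leaves the channels in BGR order, so it is an illustration of the layout, not a replacement for cv.dnn.blobFromImage.

```python
import numpy as np


def blob_sketch(img, scale, mean):
    """Illustrative stand-in for the blob step, without resizing.

    img:   H x W x 3 array, channels in BGR order as loaded by OpenCV
    mean:  per-channel mean; extra entries (e.g. the 4th from cv.mean)
           are ignored
    """
    x = img.astype(np.float32)
    x -= np.array(mean[:3], dtype=np.float32)  # subtract per-channel mean
    x *= scale                                 # then apply the scale factor
    # H x W x C  ->  C x H x W, then add a leading batch axis of size 1
    return x.transpose(2, 0, 1)[np.newaxis, ...]
```

For a 4 x 5 test image this returns an array of shape (1, 3, 4, 5); the real call on this page produces (1, 3, 224, 224) because of the resize.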

Run the blob through the model:

net.setInput(blob)
detections = net.forward()

Extract predictions:

idxs = np.argsort(detections[0])[::-1][:5]
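The argsort line packs three steps into one expression. Here it is on a made-up score vector (the numbers are invented, not model output), assuming as above that the first row of the output holds one score per class.

```python
import numpy as np

# Made-up scores for five hypothetical classes
scores = np.array([0.10, 0.50, 0.05, 0.30, 0.02])

# argsort returns indices in ascending score order; [::-1] reverses that
# to descending; [:3] keeps the indices of the three highest scores
top3 = np.argsort(scores)[::-1][:3]
print(top3)  # [1 3 0]
```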

Loop over the top 5 predictions and print the label and probability:

for (i, idx) in enumerate(idxs):
   # print predicted label and probability to console
   print("({}) {}, p = {:.5}".format(i + 1,
      classes[idx], detections[0][idx]))
   # write label and prediction for most likely category on image
   if i == 0:
      text = "{}, p={:.2f}%".format(classes[idx],
         detections[0][idx] * 100)
      cv.putText(img, text, (5, 25), cv.FONT_HERSHEY_SIMPLEX,
         0.7, (0, 0, 255), 2)

Output:

(1) Labrador retriever, p = 0.51849
(2) Chesapeake Bay retriever, p = 0.35624
(3) German short-haired pointer, p = 0.048037
(4) curly-coated retriever, p = 0.032651
(5) vizsla, p = 0.023717

Display image:

cv.imshow('img', img)

Wait for any key to be pressed, then close the window:

cv.waitKey()
cv.destroyAllWindows()

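Output: a window titled 'img' showing the image with the top label and its probability written in red near the top-left corner.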

Code list:

import numpy as np
import cv2 as cv
import sys # only used to print python version
print('Python: ',sys.version)
print('numPy: ',np.__version__)
print('OpenCV: ',cv.__version__)

# define model filenames
PROTOTXT = 'bvlc_googlenet_deploy.prototxt'
MODEL = 'bvlc_googlenet.caffemodel'

# detection parameters
CONFIDENCE = .4

# load model from disk
net = cv.dnn.readNetFromCaffe(PROTOTXT, MODEL)

# Load class label file and extract labels to array
rows = open('_labels/synset_words.txt').read().strip().split("\n")
classes = [r[r.find(" ") + 1:].split(",")[0] for r in rows]

# load image file
img = cv.imread('example.jpg')

# calculate image mean
imgM = cv.mean(img)
print('image mean ',imgM)

# generate 4-D blob from image
# scale factor = 1.0
# size of output image = (224, 224)
# mean of image to be subtracted (mean-B, mean-G, mean-R; OpenCV uses BGR order)
blob = cv.dnn.blobFromImage(img , 1.0, (224, 224), imgM)

# run blob through model
net.setInput(blob)
detections = net.forward()

# Extract predictions
idxs = np.argsort(detections[0])[::-1][:5]

# Loop over the top 5 predictions and print the label and probability
for (i, idx) in enumerate(idxs):
   # print predicted label and probability to console
   print("({}) {}, p = {:.5}".format(i + 1,
      classes[idx], detections[0][idx]))
   # write label and prediction for most likely category on image
   if i == 0:
      text = "{}, p={:.2f}%".format(classes[idx],
         detections[0][idx] * 100)
      cv.putText(img, text, (5, 25), cv.FONT_HERSHEY_SIMPLEX,
         0.7, (0, 0, 255), 2)

# Display image
cv.imshow('img', img)

# wait for any key to be pressed, and then close window
cv.waitKey()
cv.destroyAllWindows()
