Based upon:
With additional input from:
2. https://www.pyimagesearch.com/2017/08/21/deep-learning-with-opencv/
3. https://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb
A) Download Caffe Models
Need the following three files:
- .prototxt file containing model definition
- .caffemodel file containing layer weights
- a synset file containing category labels
and an image file. Just steal one from the internet like everyone else does (or get a free one from www.pexels.com).
Before going any further, create a project folder to hold the Caffe model files and the Python script. Save the image file to the project folder as well.
Download .prototxt and .caffemodel files
Link to files:
https://github.com/BVLC/caffe/tree/master/models/bvlc_googlenet
This will open a page looking something like this:

Left-click (LMB) on caffemodel_url to download the .caffemodel file. Copy the file from the Downloads folder to the project folder.
Left-click (LMB) on “deploy.prototxt”. This will open a page listing the contents of the file. Right-click (RMB) on [Raw] and select “Save link as . . .” to download the .prototxt file. Save the file in the project folder. I renamed it “bvlc_googlenet_deploy.prototxt” to identify it with the model.
Download synset file
The following link downloads a tar file containing the synset file:
http://dl.caffe.berkeleyvision.org/caffe_ilsvrc12.tar.gz
Move the file from the Downloads folder to the project folder, then extract it to a separate folder within the project folder (the tar file contains more than just the synset file).
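If you prefer to do the extraction from Python, here is a minimal sketch (it assumes the archive sits in the project folder and extracts into a folder named _labels, to match the path used in the script below):
import tarfile
# extract the archive into a _labels sub-folder; it contains several files,
# but only synset_words.txt is needed for this walkthrough
with tarfile.open('caffe_ilsvrc12.tar.gz', 'r:gz') as tf:
    tf.extractall('_labels')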
B) Python Code
Create a new Python script file in the project folder. Copy the code below into that file (or paste it into IPython or a similar console).
Import modules and print versions:
import numpy as np
import cv2 as cv
import sys # only used to print python version
print('Python: ',sys.version)
print('numPy: ',np.__version__)
print('OpenCV: ',cv.__version__)
Output:
Python: 3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)]
numPy: 1.16.4
OpenCV: 4.1.1
Define model filenames:
PROTOTXT = 'bvlc_googlenet_deploy.prototxt'
MODEL = 'bvlc_googlenet.caffemodel'
Define confidence level:
CONFIDENCE = .4
This defines the minimum probability that constitutes a successful classification. Note that CONFIDENCE is not actually applied anywhere else in this walkthrough; a sketch of how it could be used follows.
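One way to use it (a sketch only; it assumes it runs after net.forward() below, once detections and classes exist):
# hypothetical use of CONFIDENCE: accept the top prediction only if its
# probability clears the threshold (run this after net.forward() below)
top_idx = int(np.argmax(detections[0]))
if detections[0][top_idx] >= CONFIDENCE:
    print('classified as', classes[top_idx])
else:
    print('no classification above the confidence threshold')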
Load model:
net = cv.dnn.readNetFromCaffe(PROTOTXT, MODEL)
Load class label file and extract labels to array:
rows = open('_labels/synset_words.txt').read().strip().split("\n")
classes = [r[r.find(" ") + 1:].split(",")[0] for r in rows]
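Each line of synset_words.txt starts with a WordNet ID followed by a comma-separated list of names; the comprehension keeps only the first name after the ID. A quick check of the parsing on one typical line (the sample string here is just an illustration):
# illustrate the label parsing on a single sample synset line
sample = 'n01440764 tench, Tinca tinca'
print(sample[sample.find(" ") + 1:].split(",")[0])   # -> tench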
Load image file:
img = cv.imread('example.jpg')
Calculate image mean:
imgM = cv.mean(img)
print('image mean ',imgM)
Output (mean-B, mean-G, mean-R, plus a fourth element that is always 0.0 for a 3-channel image; cv.imread loads images in BGR channel order):
image mean (135.6633213531514, 153.53549636533464, 168.6211724333983, 0.0)
Generate 4D blob from image:
blob = cv.dnn.blobFromImage(img, 1.0, (224, 224), imgM)
where the parameters for this command are (a quick check of the resulting blob follows this list):
- scale factor = 1.0
- size of output image = (224, 224)
- mean to be subtracted from the image (given in the image's BGR channel order, since swapRB is left at its default of False)
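As a quick sanity check (not required for the rest of the script), blobFromImage returns a 4-D array in NCHW order:
# blob layout is (batch, channels, height, width)
print(blob.shape)   # expected: (1, 3, 224, 224)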
Run the blob through the model:
net.setInput(blob)
detections = net.forward()
Sort the predictions and take the indices of the top five:
idxs = np.argsort(detections[0])[::-1][:5]
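Here detections[0] is the 1000-element probability vector over the ImageNet classes; np.argsort returns indices in ascending order, [::-1] reverses them, and [:5] keeps the five highest. A toy illustration of the idiom:
# toy example of the argsort / reverse / slice idiom
scores = np.array([0.1, 0.7, 0.05, 0.15])
print(np.argsort(scores)[::-1][:2])   # -> [1 3], the indices of the two largest scores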
Loop over the top 5 predictions and print the label and probability:
for (i, idx) in enumerate(idxs):
    # print predicted label and probability to console
    print("({}) {}, p = {:.5}".format(i + 1,
          classes[idx], detections[0][idx]))
    # write label and prediction for most likely category on image
    if i == 0:
        text = "{}, p={:.2f}%".format(classes[idx],
               detections[0][idx] * 100)
        cv.putText(img, text, (5, 25), cv.FONT_HERSHEY_SIMPLEX,
                   0.7, (0, 0, 255), 2)
Output:
(1) Labrador retriever, p = 0.51849
(2) Chesapeake Bay retriever, p = 0.35624
(3) German short-haired pointer, p = 0.048037
(4) curly-coated retriever, p = 0.032651
(5) vizsla, p = 0.023717
Display image:
cv.imshow('img', img)
Wait for any key to be pressed, and then close the window:
cv.waitKey()
cv.destroyAllWindows()
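To keep the annotated image rather than just displaying it, it can also be written to disk (the output filename here is only an example):
# optionally save the annotated image alongside the original
cv.imwrite('example_labeled.jpg', img)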
Output:

Full code listing:
import numpy as np
import cv2 as cv
import sys # only used to print python version
print('Python: ',sys.version)
print('numPy: ',np.__version__)
print('OpenCV: ',cv.__version__)
# define model filenames
PROTOTXT = 'bvlc_googlenet_deploy.prototxt'
MODEL = 'bvlc_googlenet.caffemodel'
# detection parameters
CONFIDENCE = .4
# load model from disk
net = cv.dnn.readNetFromCaffe(PROTOTXT, MODEL)
# Load class label file and extract labels to array
rows = open('_labels/synset_words.txt').read().strip().split("\n")
classes = [r[r.find(" ") + 1:].split(",")[0] for r in rows]
# load image file
img = cv.imread('example.jpg')
# calculate image mean
imgM = cv.mean(img)
print('image mean ',imgM)
# generate 4-D blob from image
# scale factor = 1.0
# size of output image = (224, 224)
# mean of image to be subtracted (mean-B, mean-G, mean-R)
blob = cv.dnn.blobFromImage(img, 1.0, (224, 224), imgM)
# run blob through model
net.setInput(blob)
detections = net.forward()
# Extract predictions
idxs = np.argsort(detections[0])[::-1][:5]
# Loop over the top 5 predictions and print the label and probability
for (i, idx) in enumerate(idxs):
    # print predicted label and probability to console
    print("({}) {}, p = {:.5}".format(i + 1,
          classes[idx], detections[0][idx]))
    # write label and prediction for most likely category on image
    if i == 0:
        text = "{}, p={:.2f}%".format(classes[idx],
               detections[0][idx] * 100)
        cv.putText(img, text, (5, 25), cv.FONT_HERSHEY_SIMPLEX,
                   0.7, (0, 0, 255), 2)
# Display image
cv.imshow('img', img)
# wait for any key to be pressed, and then close window
cv.waitKey()
cv.destroyAllWindows()
