There are multiple image classification datasets available online or bundled with Python ML modules, and this notebook contains sample code for image classification on some of those publicly available datasets. In this post I will use a very plain solution, and definitely not a perfect one (deep neural networks are much better algorithms for this kind of problem, but they will be covered in a separate post). The idea was to define a common code pattern for such problems (if something like that exists at all), to learn how to modify and display image datasets, and to apply ML classification algorithms to them. The first two examples are based on the Python Data Science Handbook by Jake VanderPlas, which in my humble opinion is one of the best sources for studying Python ML modules. What is really awesome is that the book is available in Jupyter notebook format, so you can take the code and experiment with it. Although my code might look different from the code in the book, it is based on Jake’s book, so all credit goes to him, and I am including the following notice here:
“….. Python Data Science Handbook by Jake VanderPlas; the content is available on GitHub.*
The text is released under the CC-BY-NC-ND license, and code is released under the MIT license. If you find this content useful, please consider supporting the work by buying the book!….”
The third part of my code is a simple solution to a problem introduced in the Udacity nanodegree program. Unfortunately I didn’t participate in that program, so the training/test datasets were downloaded from the German Traffic Sign Dataset. If you want to dive deeper into these three problems, I recommend getting more information directly from the sources mentioned above.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from random import randint
%matplotlib inline
import seaborn as sns
import warnings
warnings.filterwarnings("ignore");
# Let's take a look at this dataset
help(load_digits)
# Check dataset size
digits = load_digits()
print(digits.data.shape)
# Each data point is an 8x8 image, with pixel values between 0 and 16
digits.images[3]
# The training data are flattened image arrays (64 values each)
digits.data[3]
#Let's plot sample digit
plt.figure(figsize=(6,3))
plt.matshow(digits.images[3], fignum=1);
# Displaying first 64 images with target values
fig, axes = plt.subplots(8, 8, figsize=(8, 8),
                         subplot_kw={'xticks':[], 'yticks':[]},
                         gridspec_kw=dict(hspace=0.1, wspace=0.1));
for i, ax in enumerate(axes.flat):
    ax.imshow(digits.images[i], cmap='binary')
    ax.text(0, 7, str(digits.target[i]), color='green')
# Splitting dataset into training (80%) and testing (20%) datasets
Xtrain, Xtest, ytrain, ytest = train_test_split(digits.data, digits.target, test_size=0.2, random_state=1000)
A random forest is basically a collection of decision trees that are used together to produce the final output. Like decision trees, a random forest is a supervised learning algorithm that can be used for both regression and classification problems. To get a prediction from a random forest, we use the output of each tree, commonly called a “vote”. The final output is the class with the most votes. (source: https://analyticsdefined.com/introduction-random-forests/)
# Random Forest classifier
model=RandomForestClassifier(n_estimators=1000)
model.fit(Xtrain,ytrain)
ypred=model.predict(Xtest)
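The voting idea described above can be sketched in a few lines of plain Python. This is only an illustration: `majority_vote` and the list of votes are hypothetical names and made-up values, not part of scikit-learn. Note also that scikit-learn's `RandomForestClassifier` combines trees by averaging their predicted class probabilities rather than by counting hard votes, so this sketch shows the concept, not the library's exact procedure.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the label predicted by the most trees for a single sample.

    If several labels share the top count, the one that appears first
    in the list wins.
    """
    return Counter(tree_predictions).most_common(1)[0][0]

# Hypothetical outputs of five trees for one digit image
votes = [3, 8, 3, 3, 9]
print(majority_vote(votes))  # 3
```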
Classifier output quality evaluation
# Displaying results. If digit in bottom left corner is green then classification was correct.
# If this digit is red, then classification was wrong.
fig, axes = plt.subplots(8, 8, figsize=(8, 8),
                         subplot_kw={'xticks':[], 'yticks':[]},
                         gridspec_kw=dict(hspace=0.1, wspace=0.1));
for i, ax in enumerate(axes.flat):
    ax.imshow(Xtest[i].reshape((8, 8)), cmap='binary')
    if ytest[i] == ypred[i]:
        ax.text(0, 7, str(ypred[i]), color='green')
    else:
        ax.text(0, 7, str(ypred[i]), color='red')
# Display classification report (true labels first, then predictions)
print(metrics.classification_report(ytest, ypred))
Quick theory reminder:
A true positive (TP) is an outcome where the model correctly predicts the positive class. Similarly, a true negative (TN) is an outcome where the model correctly predicts the negative class.
A false positive (FP)is an outcome where the model incorrectly predicts the positive class. And a false negative (FN) is an outcome where the model incorrectly predicts the negative class.
Precision shows what proportion of positive identifications was actually correct:
$$Precision= \frac{TP}{TP+FP}$$
Recall shows what proportion of actual positives was identified correctly:
$$Recall= \frac{TP}{TP+FN}$$
Accuracy is one metric for evaluating classification models. Informally, accuracy is the fraction of predictions our model got right. Formally, accuracy has the following definition:
$$Accuracy= \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}$$
so:
$$Accuracy= \frac{TP+TN}{TP+TN+FP+FN}$$
Another metric for evaluating classification models is the F1-score, the harmonic mean of precision and recall:
$$\text{F1-score}= \frac{2 \cdot Precision \cdot Recall}{Precision+Recall}$$
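As a small worked example of the formulas above, the sketch below counts TP, FP, FN and TN for one class treated as "positive" (one-vs-rest) and then computes the metrics. The function names and the label lists are made up for illustration; in the notebook itself `metrics.classification_report` does this for every class.

```python
def one_vs_rest_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN, treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision_recall_f1(tp, fp, fn):
    """Apply the formulas above, returning 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Made-up labels, for illustration only
y_true = [3, 3, 3, 8, 8, 1]
y_pred = [3, 3, 8, 8, 3, 1]

tp, fp, fn, tn = one_vs_rest_counts(y_true, y_pred, positive=3)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
accuracy = (tp + tn) / (tp + tn + fp + fn)  # one-vs-rest accuracy for class 3
print(tp, fp, fn, tn)
print(precision, recall, f1, accuracy)
```

Here class 3 has 2 true positives, 1 false positive, 1 false negative and 2 true negatives, so precision, recall and F1 all come out at 2/3.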
from sklearn.datasets import fetch_lfw_people
# Let's take a look at this dataset
help(fetch_lfw_people)
#Extract only those faces for which we have at least 100 pictures
faces = fetch_lfw_people(min_faces_per_person=100)
print(faces.target_names)
print(faces.images.shape)
#Each datapoint is a 62x47 image, with pixel value between 0 and 255.
faces.images[1]
# Let's plot a sample face
plt.figure(figsize=(6,3))
plt.matshow(faces.images[1], fignum=1);
# Displaying first 16 images with target names
fig, axes = plt.subplots(4, 4, figsize=(8, 12),
                         subplot_kw={'xticks':[], 'yticks':[]},
                         gridspec_kw=dict(hspace=0.1, wspace=0.1));
for i, ax in enumerate(axes.flat):
    ax.imshow(faces.images[i], cmap='bone')
    ax.text(0, 67, faces.target_names[faces.target[i]], color='green')
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
To determine the number of components, let's plot the “cumulative explained variance ratio as a function of the number of components”.
pca = PCA().fit(faces.data)
plt.figure(figsize=(10,10))
plt.plot(np.cumsum(pca.explained_variance_ratio_))
plt.xlabel('number of components')
plt.ylabel('cumulative explained variance');
Let's use 200 components, as they capture approximately 95% of the variance.
pca = PCA(n_components=200, whiten=True)
svc = SVC(kernel='rbf')
model_pipe = make_pipeline(pca, svc)
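The number of components above was read off the plot. As a rough sketch, the same threshold can be found programmatically by walking the cumulative sum until it reaches the target. The helper name `components_for_variance` and the sample ratios are hypothetical, chosen only to keep the example self-contained.

```python
from itertools import accumulate

def components_for_variance(explained_variance_ratio, target=0.95):
    """Smallest number of leading components whose cumulative
    explained variance ratio reaches `target`; None if never reached."""
    for n, cumulative in enumerate(accumulate(explained_variance_ratio), start=1):
        if cumulative >= target:
            return n
    return None

# Made-up ratios, for illustration only
ratios = [0.5, 0.25, 0.125, 0.0625, 0.0625]
print(components_for_variance(ratios, target=0.9))  # 4
```

With the fitted PCA from the notebook, one would pass `pca.explained_variance_ratio_` instead of the made-up list. scikit-learn's `PCA` can also be given a float between 0 and 1 as `n_components` (with `svd_solver='full'`) to pick the number of components by explained variance.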
from sklearn.model_selection import train_test_split, GridSearchCV
Xtrain, Xtest, ytrain, ytest = train_test_split(faces.data, faces.target, test_size=0.2)
param_grid = {'svc__C': [1, 5, 10, 50],
'svc__gamma': [0.0001, 0.0005, 0.001, 0.005]}
grid = GridSearchCV(model_pipe, param_grid)
grid.fit(Xtrain, ytrain)
print(grid.best_params_)
ypred = grid.best_estimator_.predict(Xtest)
# Displaying results. If name in bottom left corner is green then classification was correct.
# If this name is red, then classification was wrong.
fig, axes = plt.subplots(4, 4, figsize=(8, 12),
                         subplot_kw={'xticks':[], 'yticks':[]},
                         gridspec_kw=dict(hspace=0.1, wspace=0.1));
for i, ax in enumerate(axes.flat):
    ax.imshow(Xtest[i].reshape((62, 47)), cmap='bone')
    if ytest[i] == ypred[i]:
        ax.text(0, 67, faces.target_names[ypred[i]], color='green')
    else:
        ax.text(0, 67, faces.target_names[ypred[i]], color='red')
from sklearn.metrics import classification_report
print(classification_report(ytest, ypred,
target_names=faces.target_names))
from sklearn.metrics import confusion_matrix
conf_mat = confusion_matrix(ytest, ypred)
sns.heatmap(conf_mat, square=True, annot=True, fmt='d', cbar=True,
xticklabels=faces.target_names,
yticklabels=faces.target_names)
plt.xlabel('Predicted label')
plt.ylabel('True label');
#https://github.com/jeremy-shannon/CarND-Traffic-Sign-Classifier-Project/blob/master/Traffic_Sign_Classifier.md
#import module for working with pickled data
import pickle
Please download the data from the German Traffic Sign Dataset, as those files are too large to push to GitHub.
training_raw_data=".\\signs\\train.p"
testing_raw_data=".\\signs\\test.p"
label_names =pd.read_csv (".\\signs\\names.csv")
with open(training_raw_data, 'rb') as file:
    train = pickle.load(file)
with open(testing_raw_data, 'rb') as file:
    test = pickle.load(file)
print(train['features'].shape)
print(train['labels'].shape)
Xtrain=train['features']
ytrain=train['labels']
Xtest=test['features']
ytest=test['labels']
# Function converting RGB images to grayscale (the variant that also scales values to 0-1 is left commented out)
def rgb2grey(rgb):
    #rgb=(0.299 * rgb[:, :, :, 0] + 0.587 * rgb[:, :, :, 1] + 0.114 * rgb[:, :, :, 2])/255.
    rgb=(0.299 * rgb[:, :, :, 0] + 0.587 * rgb[:, :, :, 1] + 0.114 * rgb[:, :, :, 2])
    return rgb
Xtrain =rgb2grey(Xtrain)
Xtest =rgb2grey(Xtest)
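To make the weighted sum in `rgb2grey` concrete, here is a minimal single-pixel sketch using the same coefficients. The function name `luma` and its `scale` flag are hypothetical and only illustrate the optional division by 255 that maps values to the 0-1 range.

```python
def luma(r, g, b, scale=True):
    """Weighted grayscale value of one RGB pixel with 0-255 channels."""
    grey = 0.299 * r + 0.587 * g + 0.114 * b
    # Dividing by 255 maps the result to the 0-1 range
    return grey / 255.0 if scale else grey

print(luma(255, 255, 255))  # close to 1.0 for a white pixel
print(luma(0, 0, 0))        # 0.0 for a black pixel
```

The three weights sum to 1, so a white pixel stays at full brightness and green contributes the most to the perceived intensity.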
fig, axes = plt.subplots(6, 5, figsize=(18, 14),
                         subplot_kw={'xticks':[], 'yticks':[]},
                         gridspec_kw=dict(hspace=0.2, wspace=0.1));
for i, ax in enumerate(axes.flat):
    # randint includes both endpoints, so the upper bound is the last valid index
    rand_num = randint(0, Xtrain.shape[0] - 1)
    ax.imshow(Xtrain[rand_num], cmap='bone')
    ax.text(0, 34, label_names.iloc[train['labels'][rand_num]]['SignName'], color='green')
# Flatten each image into a single row of pixel values
Xtrain_data = Xtrain.reshape(Xtrain.shape[0], -1)
Xtest_data = Xtest.reshape(Xtest.shape[0], -1)
pca = PCA().fit(Xtrain_data)
plt.figure(figsize=(10,10))
plt.plot(np.cumsum(pca.explained_variance_ratio_))
plt.xlabel('number of components')
plt.ylabel('cumulative explained variance');
Let's use 200 components, as they capture approximately 95% of the variance.
pca = PCA(n_components=200, whiten=True)
#{'svc__C': 10, 'svc__gamma': 0.001} from grid search
svc = SVC(kernel='rbf', C=10, gamma=0.001)
model_pipe = make_pipeline(pca, svc)
model_pipe.fit(Xtrain_data, ytrain)
ypred = model_pipe.predict(Xtest_data)
# Displaying results. If sign name in bottom left corner is green then classification was correct.
# If this name is red, then classification was wrong.
fig, axes = plt.subplots(6, 5, figsize=(18, 14),
                         subplot_kw={'xticks':[], 'yticks':[]},
                         gridspec_kw=dict(hspace=0.2, wspace=0.1));
for i, ax in enumerate(axes.flat):
    # randint includes both endpoints, so the upper bound is the last valid index
    rand_num = randint(0, Xtest.shape[0] - 1)
    ax.imshow(Xtest[rand_num], cmap='bone')
    if ytest[rand_num] == ypred[rand_num]:
        ax.text(0, 34, label_names.iloc[ypred[rand_num]]['SignName'], color='green')
    else:
        ax.text(0, 34, label_names.iloc[ypred[rand_num]]['SignName'], color='red')
from sklearn.metrics import classification_report
print(classification_report(ytest, ypred,
target_names=label_names['SignName']))
I would say that the model accuracy is so-so, and honestly I wouldn’t like to sit in an autonomous car with this implementation of a traffic sign recognition model. I will try to do better in another post about deep neural networks.