Using deep features to build an image classifier

Fire up GraphLab Create

(See Getting Started with SFrames for setup instructions)

In [1]:
import graphlab
In [2]:
# Limit number of worker processes. This preserves system memory, which prevents hosted notebooks from crashing.
graphlab.set_runtime_config('GRAPHLAB_DEFAULT_NUM_PYLAMBDA_WORKERS', 4)
This non-commercial license of GraphLab Create for academic use is assigned to 870891415@qq.com and will expire on December 06, 2017.
[INFO] graphlab.cython.cy_server: GraphLab Create v2.1 started. Logging: C:\Users\lenovo\AppData\Local\Temp\graphlab_server_1488607906.log.0

Load a common image analysis dataset

We will use a popular benchmark dataset in computer vision called CIFAR-10.

(We've reduced the data to just 4 categories = {'cat','bird','automobile','dog'}.)

This dataset is already split into a training set and test set.

In [3]:
image_train = graphlab.SFrame('image_train_data/')
image_test = graphlab.SFrame('image_test_data/')

Exploring the image data

In [4]:
graphlab.canvas.set_target('ipynb')
In [5]:
image_train['image'].show()
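
Besides browsing the images, it is useful to see how many examples of each category the training set contains. The cell below is a hypothetical addition (not part of the original run) that summarizes the label column:

In [ ]:
# Hypothetical: summarize the label column to count how many training images
# belong to each of the four categories.
image_train['label'].sketch_summary()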

Train a classifier on the raw image pixels

We start by training a classifier on just the raw pixels of the images.

In [6]:
raw_pixel_model = graphlab.logistic_classifier.create(image_train,target='label',
                                              features=['image_array'])
PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.

WARNING: The number of feature dimensions in this problem is very large in comparison with the number of examples. Unless an appropriate regularization value is set, this model may not provide accurate predictions for a validation/test set.
Logistic regression:
--------------------------------------------------------
Number of examples          : 1909
Number of classes           : 4
Number of feature columns   : 1
Number of unpacked features : 3072
Number of coefficients    : 9219
Starting L-BFGS
--------------------------------------------------------
+-----------+----------+-----------+--------------+-------------------+---------------------+
| Iteration | Passes   | Step size | Elapsed Time | Training-accuracy | Validation-accuracy |
+-----------+----------+-----------+--------------+-------------------+---------------------+
| 1         | 6        | 0.000007  | 2.827992     | 0.270299          | 0.229167            |
| 2         | 9        | 5.000000  | 3.932771     | 0.409115          | 0.447917            |
| 3         | 10       | 5.000000  | 4.378085     | 0.412782          | 0.375000            |
| 4         | 12       | 1.000000  | 5.099595     | 0.440021          | 0.458333            |
| 5         | 13       | 1.000000  | 5.525896     | 0.450498          | 0.458333            |
| 6         | 14       | 1.000000  | 5.970210     | 0.422734          | 0.385417            |
| 10        | 19       | 1.000000  | 8.337883     | 0.502357          | 0.531250            |
+-----------+----------+-----------+--------------+-------------------+---------------------+
TERMINATED: Iteration limit reached.
This model may not be optimal. To improve it, consider increasing `max_iterations`.
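
The log reports that the iteration limit was reached before convergence. To give L-BFGS more iterations, we could pass max_iterations explicitly; the cell below is a hypothetical variant of the training call above (with an illustrative variable name) and is not run in this notebook.

In [ ]:
# Hypothetical variant of the cell above: raise the L-BFGS iteration limit,
# as the termination message suggests.
raw_pixel_model_more_iters = graphlab.logistic_classifier.create(image_train,
                                                                 target='label',
                                                                 features=['image_array'],
                                                                 max_iterations=50)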

Make a prediction with the simple model based on raw pixels

In [7]:
image_test[0:3]['image'].show()
In [8]:
image_test[0:3]['label']
Out[8]:
dtype: str
Rows: 3
['cat', 'automobile', 'cat']
In [9]:
raw_pixel_model.predict(image_test[0:3])
Out[9]:
dtype: str
Rows: 3
['dog', 'dog', 'bird']

The model makes wrong predictions for all three images.
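
To quantify this on the same three images, here is a small hypothetical check (not part of the original run) that compares the two SArrays shown in the cells above:

In [ ]:
# Hypothetical check: count how many of the first three raw-pixel predictions
# match the true labels (0 out of 3 for the run shown above).
predictions = raw_pixel_model.predict(image_test[0:3])
targets = image_test[0:3]['label']
print (predictions == targets).sum(), 'out of 3 correct'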

Evaluating raw pixel model on test data

In [10]:
raw_pixel_model.evaluate(image_test)
Out[10]:
{'accuracy': 0.463, 'auc': 0.6981773749999998, 'confusion_matrix': Columns:
 	target_label	str
 	predicted_label	str
 	count	int
 
 Rows: 16
 
 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |     cat      |       bird      |  235  |
 |  automobile  |       dog       |  150  |
 |     cat      |       cat       |  272  |
 |     bird     |    automobile   |  129  |
 |  automobile  |    automobile   |  647  |
 |     dog      |    automobile   |  127  |
 |     dog      |       dog       |  435  |
 |     cat      |       dog       |  326  |
 |     bird     |       dog       |  216  |
 |  automobile  |       bird      |   87  |
 +--------------+-----------------+-------+
 [16 rows x 3 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns., 'f1_score': 0.45762706644397577, 'log_loss': 1.2610972411075254, 'precision': 0.4565216088183009, 'recall': 0.463, 'roc_curve': Columns:
 	threshold	float
 	fpr	float
 	tpr	float
 	p	int
 	n	int
 	class	int
 
 Rows: 400004
 
 Data:
 +-----------+-----+-----+------+------+-------+
 | threshold | fpr | tpr |  p   |  n   | class |
 +-----------+-----+-----+------+------+-------+
 |    0.0    | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   1e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   2e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   3e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   4e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   5e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   6e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   7e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   8e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   9e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 +-----------+-----+-----+------+------+-------+
 [400004 rows x 6 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.}

The accuracy of this model is poor: only about 46%, although that is still better than the 25% we would expect from random guessing among the four balanced classes.
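
The evaluation output above only prints the head of the confusion matrix; following the hint in that output, a hypothetical follow-up cell could show all 16 rows:

In [ ]:
# Hypothetical follow-up: keep the evaluation results and print the full
# 16-row confusion matrix (only its head is shown above).
raw_pixel_results = raw_pixel_model.evaluate(image_test)
raw_pixel_results['confusion_matrix'].print_rows(num_rows=16)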

Can we improve the model using deep features?

We only have 2005 data points, so it is not possible to train a deep neural network effectively with so little data. Instead, we will use transfer learning: we take the deep features produced by a network trained on the full ImageNet dataset and train a simple classifier on top of them for our small dataset (the whole recipe is sketched after the cell below).

In [11]:
len(image_train)
Out[11]:
2005
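
Put together, the transfer-learning recipe looks like the sketch below. It is shown for orientation only and is not run here; each step appears as a real cell later in the notebook.

In [ ]:
# Sketch of the transfer-learning recipe (orientation only; not run here).
# Step 1: load a deep model pre-trained on the full ImageNet dataset.
# deep_learning_model = graphlab.load_model('http://s3.amazonaws.com/GraphLab-Datasets/deeplearning/imagenet_model_iter45')
# Step 2: use that model as a feature extractor on our small dataset.
# image_train['deep_features'] = deep_learning_model.extract_features(image_train)
# Step 3: train a simple classifier on top of the extracted features.
# deep_features_model = graphlab.logistic_classifier.create(image_train, target='label', features=['deep_features'])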

Computing deep features for our images

The two lines below (left commented out) show how to compute the deep features. This computation takes a little while, so we have already computed them and saved the results as a column in the data you loaded.

(Note that if you would like to compute such deep features yourself and have a GPU on your machine, you should use the GPU-enabled GraphLab Create, which will be significantly faster for this task.)

The function extract_features() is what extracts these deep features.

In [ ]:
# deep_learning_model = graphlab.load_model('http://s3.amazonaws.com/GraphLab-Datasets/deeplearning/imagenet_model_iter45')
# image_train['deep_features'] = deep_learning_model.extract_features(image_train)

As we can see, the column deep_features already contains the pre-computed deep features for this data.

In [12]:
image_train.head()
Out[12]:
+-----+----------------------+------------+-------------------------------------------------------+--------------------------------------------------------+
| id  | image                | label      | deep_features                                         | image_array                                            |
+-----+----------------------+------------+-------------------------------------------------------+--------------------------------------------------------+
| 24  | Height: 32 Width: 32 | bird       | [0.242871761322, 1.09545373917, 0.0, ...              | [73.0, 77.0, 58.0, 71.0, 68.0, 50.0, 77.0, 69.0, ...   |
| 33  | Height: 32 Width: 32 | cat        | [0.525087952614, 0.0, 0.0, 0.0, 0.0, 0.0, ...         | [7.0, 5.0, 8.0, 7.0, 5.0, 8.0, 5.0, 4.0, 6.0, 7.0, ... |
| 36  | Height: 32 Width: 32 | cat        | [0.566015958786, 0.0, 0.0, 0.0, 0.0, 0.0, ...         | [169.0, 122.0, 65.0, 131.0, 108.0, 75.0, ...           |
| 70  | Height: 32 Width: 32 | dog        | [1.12979578972, 0.0, 0.0, 0.778194487095, 0.0, ...    | [154.0, 179.0, 152.0, 159.0, 183.0, 157.0, ...         |
| 90  | Height: 32 Width: 32 | bird       | [1.71786928177, 0.0, 0.0, 0.0, 0.0, 0.0, ...          | [216.0, 195.0, 180.0, 201.0, 178.0, 160.0, ...         |
| 97  | Height: 32 Width: 32 | automobile | [1.57818555832, 0.0, 0.0, 0.0, 0.0, 0.0, ...          | [33.0, 44.0, 27.0, 29.0, 44.0, 31.0, 32.0, 45.0, ...   |
| 107 | Height: 32 Width: 32 | dog        | [0.0, 0.0, 0.220677852631, 0.0, ...                   | [97.0, 51.0, 31.0, 104.0, 58.0, 38.0, 107.0, 61.0, ... |
| 121 | Height: 32 Width: 32 | bird       | [0.0, 0.23753464222, 0.0, 0.0, 0.0, 0.0, ...          | [93.0, 96.0, 88.0, 102.0, 106.0, 97.0, 117.0, ...      |
| 136 | Height: 32 Width: 32 | automobile | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.5737862587, 0.0, ... | [35.0, 59.0, 53.0, 36.0, 56.0, 56.0, 42.0, 62.0, ...   |
| 138 | Height: 32 Width: 32 | bird       | [0.658935725689, 0.0, 0.0, 0.0, 0.0, 0.0, ...         | [205.0, 193.0, 195.0, 200.0, 187.0, 193.0, ...         |
+-----+----------------------+------------+-------------------------------------------------------+--------------------------------------------------------+
[10 rows x 5 columns]
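
As a quick sanity check (a hypothetical cell, not in the original run), we can look at the length of a single deep feature vector: it should have 4096 entries, matching the "Number of unpacked features" reported in the training log below, compared with the 3072 (32 x 32 x 3) raw pixel values per image.

In [ ]:
# Hypothetical check: each deep feature vector has 4096 entries,
# versus 3072 (= 32 x 32 x 3) raw pixel values per image.
len(image_train['deep_features'][0])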

Given the deep features, let's train a classifier

In [13]:
deep_features_model = graphlab.logistic_classifier.create(image_train,
                                                         features=['deep_features'],
                                                         target='label')
PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.

WARNING: The number of feature dimensions in this problem is very large in comparison with the number of examples. Unless an appropriate regularization value is set, this model may not provide accurate predictions for a validation/test set.
WARNING: Detected extremely low variance for feature(s) 'deep_features' because all entries are nearly the same.
Proceeding with model training using all features. If the model does not provide results of adequate quality, exclude the above mentioned feature(s) from the input dataset.
Logistic regression:
--------------------------------------------------------
Number of examples          : 1911
Number of classes           : 4
Number of feature columns   : 1
Number of unpacked features : 4096
Number of coefficients    : 12291
Starting L-BFGS
--------------------------------------------------------
+-----------+----------+-----------+--------------+-------------------+---------------------+
| Iteration | Passes   | Step size | Elapsed Time | Training-accuracy | Validation-accuracy |
+-----------+----------+-----------+--------------+-------------------+---------------------+
| 1         | 4        | 0.000262  | 1.928364     | 0.772894          | 0.723404            |
| 2         | 9        | 0.146426  | 4.622266     | 0.778127          | 0.755319            |
| 3         | 10       | 0.146426  | 5.341773     | 0.778650          | 0.723404            |
| 4         | 11       | 0.146426  | 6.027257     | 0.784406          | 0.712766            |
| 5         | 12       | 0.146426  | 6.728753     | 0.799058          | 0.723404            |
| 6         | 13       | 0.146426  | 7.395224     | 0.822606          | 0.808511            |
| 7         | 14       | 0.146426  | 8.062696     | 0.829932          | 0.808511            |
| 8         | 15       | 0.146426  | 8.742175     | 0.838828          | 0.787234            |
| 9         | 16       | 0.146426  | 9.446675     | 0.863422          | 0.819149            |
| 10        | 17       | 0.146426  | 10.156174    | 0.878074          | 0.819149            |
+-----------+----------+-----------+--------------+-------------------+---------------------+
TERMINATED: Iteration limit reached.
This model may not be optimal. To improve it, consider increasing `max_iterations`.

Apply the deep features model to the first few images of the test set

In [14]:
image_test[0:3]['image'].show()
In [15]:
deep_features_model.predict(image_test[0:3])
Out[15]:
dtype: str
Rows: 3
['cat', 'automobile', 'cat']

The classifier with deep features gets all of these images right!

Compute test_data accuracy of deep_features_model

As we can see, deep features give us significantly better accuracy (about 78%).

In [16]:
deep_features_model.evaluate(image_test)
Out[16]:
{'accuracy': 0.78425, 'auc': 0.9388235416666686, 'confusion_matrix': Columns:
 	target_label	str
 	predicted_label	str
 	count	int
 
 Rows: 16
 
 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |     dog      |    automobile   |   18  |
 |     bird     |       bird      |  795  |
 |  automobile  |       bird      |   27  |
 |     dog      |       bird      |   43  |
 |     cat      |    automobile   |   43  |
 |     cat      |       cat       |  661  |
 |     bird     |       cat       |  116  |
 |     cat      |       dog       |  225  |
 |     dog      |       dog       |  727  |
 |  automobile  |    automobile   |  954  |
 +--------------+-----------------+-------+
 [16 rows x 3 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns., 'f1_score': 0.7838751425206619, 'log_loss': 0.7418136939315604, 'precision': 0.784201466080599, 'recall': 0.78425, 'roc_curve': Columns:
 	threshold	float
 	fpr	float
 	tpr	float
 	p	int
 	n	int
 	class	int
 
 Rows: 400004
 
 Data:
 +-----------+----------------+-------+------+------+-------+
 | threshold |      fpr       |  tpr  |  p   |  n   | class |
 +-----------+----------------+-------+------+------+-------+
 |    0.0    |      1.0       |  1.0  | 1000 | 3000 |   0   |
 |   1e-05   | 0.612666666667 | 0.998 | 1000 | 3000 |   0   |
 |   2e-05   | 0.558666666667 | 0.998 | 1000 | 3000 |   0   |
 |   3e-05   | 0.524333333333 | 0.998 | 1000 | 3000 |   0   |
 |   4e-05   |     0.495      | 0.998 | 1000 | 3000 |   0   |
 |   5e-05   | 0.472666666667 | 0.998 | 1000 | 3000 |   0   |
 |   6e-05   | 0.454333333333 | 0.997 | 1000 | 3000 |   0   |
 |   7e-05   | 0.440666666667 | 0.997 | 1000 | 3000 |   0   |
 |   8e-05   | 0.427333333333 | 0.997 | 1000 | 3000 |   0   |
 |   9e-05   |     0.418      | 0.997 | 1000 | 3000 |   0   |
 +-----------+----------------+-------+------+------+-------+
 [400004 rows x 6 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.}
In [ ]: