Image Classification with TensorFlow
Introduction
Image classification is a classic problem that is best solved with neural networks. It is a perception problem: it is very hard to write down rules for why a given image belongs to one class rather than another. The human brain performs such perception tasks effortlessly, yet they are extremely difficult for traditional computer algorithms. For example, a two-year-old child can tell a dog from a cat, but this remains a hard task for conventional computational methods. Machine learning, however, has made great strides in this direction. In this guide, we will train a convolutional neural network (CNN) on images of cats and dogs.
Data Preparation and Reading Data from Directories
Every machine learning application needs data to work with. In traditional ML (machine learning) tasks that deal with records, rows, or tuples, the data can be read directly into NumPy arrays or Pandas data frames (in the Python ecosystem; other languages such as R differ). Image data cannot be read and converted into tensors quite so directly, but Keras provides built-in utilities that make the task easy. The following code reads the image data from the training and validation directories.
from tensorflow import keras
from keras_preprocessing import image
from keras_preprocessing.image import ImageDataGenerator

# Augment the training images on the fly; the validation images are only rescaled.
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.15,
    height_shift_range=0.15,
    shear_range=0.15,
    zoom_range=0.15,
    horizontal_flip=True,
    fill_mode='nearest')
test_datagen = ImageDataGenerator(rescale=1./255)

# Read images from the class subdirectories, resizing everything to 150x150.
train_generator = train_datagen.flow_from_directory(
    "D:\\dogs-vs-cats_train", target_size=(150,150),
    batch_size=20, class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    "D:\\dogs-vs-cats_validation", target_size=(150,150),
    batch_size=20, class_mode='binary')
The snippet above reads the image data from the training directory (dogs-vs-cats_train) and the validation directory (dogs-vs-cats_validation) and rescales the pixel values by dividing by 255. It also resizes every image to 150x150, regardless of its original size in either directory. Note that train_datagen takes many more arguments than test_datagen: these perform data augmentation, the process of generating additional training data from the data that already exists. In the snippet above, the existing images are flipped, sheared, zoomed, shifted, and rotated, which enlarges the effective training set and helps avoid overfitting.
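To sanity-check these augmentation settings before training, one batch can be pulled from train_generator and displayed. The sketch below is illustrative and not part of the original pipeline; it assumes matplotlib is available and that train_generator was created as above (with flow_from_directory, class indices are assigned alphabetically, so cat=0 and dog=1).

import matplotlib.pyplot as plt

# Pull one augmented batch (20 images, given batch_size=20) from the generator.
images, labels = next(train_generator)

# Show the first eight augmented images with their binary labels.
fig, axes = plt.subplots(2, 4, figsize=(12, 6))
for ax, img, label in zip(axes.flat, images, labels):
    ax.imshow(img)                      # pixel values are already rescaled to [0, 1]
    ax.set_title('label: %d' % int(label))
    ax.axis('off')
plt.show()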
Convolutional Neural Networks (CNN)
Convolutional neural networks differ from traditional densely connected networks and are highly effective for image recognition and computer vision. One strength of CNNs is that they can learn a pattern in one part of an image and apply it anywhere else, in sharp contrast to traditional dense networks. For example, a CNN can learn the pattern of a cat's ear in the top-right corner of an image and then recognize that ear at any other position in a new image. Another distinctive property, besides this power, is that CNNs learn a hierarchy of patterns: the first layer may learn edge patterns, subsequent layers textures, and so on. It is also easy to specify how many filters the user wants from each convolutional layer. A filter can be thought of as a single concept the network can learn; high-level concepts might be, for example, eyes, ears, or legs.
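The translation-invariance point can be made concrete with a toy sketch (illustrative only, not from the original article): a single hand-crafted 3x3 vertical-edge filter produces a strong response wherever the edge happens to sit, which is the property a learned Conv2D filter exploits.

import numpy as np

# A hand-crafted vertical-edge detector.
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])

def conv2d_valid(img, k):
    # Naive 'valid' 2D cross-correlation, the core operation inside Conv2D.
    h, w = img.shape[0] - 2, img.shape[1] - 2
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i+3, j:j+3] * k)
    return out

# Two 8x8 images with a vertical edge at different horizontal positions.
img_a = np.zeros((8, 8)); img_a[:, :3] = 1.0
img_b = np.zeros((8, 8)); img_b[:, :6] = 1.0

print(conv2d_valid(img_a, kernel)[0])   # [0. 3. 3. 0. 0. 0.] -> edge detected near columns 1-2
print(conv2d_valid(img_b, kernel)[0])   # [0. 0. 0. 0. 3. 3.] -> same filter finds it near columns 4-5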
model = keras.Sequential([
    keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(150,150,3)),
    keras.layers.MaxPool2D((2,2)),
    keras.layers.Conv2D(64, (3,3), activation='relu'),
    keras.layers.MaxPool2D((2,2)),
    keras.layers.Conv2D(128, (3,3), activation='relu'),
    keras.layers.MaxPool2D((2,2)),
    keras.layers.Conv2D(128, (3,3), activation='relu'),
    keras.layers.MaxPool2D((2,2)),
    keras.layers.Flatten(),
    keras.layers.Dropout(0.5),                     # regularization before the dense head
    keras.layers.Dense(512, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')])  # single sigmoid unit for binary output
model.summary()
The first Conv2D layer maps 3x3 patches of the input and learns 32 filters. Likewise, the second Conv2D layer computes 64 filters, and the third and fourth Conv2D layers compute 128 filters each.
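These filter counts determine the parameter totals that model.summary() reports below. As a quick sanity check (this snippet is illustrative, not part of the original code), each Conv2D layer holds (kernel height x kernel width x input channels) weights per filter, plus one bias per filter:

# params = (kernel_h * kernel_w * input_channels) * filters + filters (biases)
print(3*3*3   * 32  + 32)    # first Conv2D:  896
print(3*3*32  * 64  + 64)    # second Conv2D: 18496
print(3*3*64  * 128 + 128)   # third Conv2D:  73856
print(3*3*128 * 128 + 128)   # fourth Conv2D: 147584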
The Max Pooling Operation
A CNN is usually accompanied by a max pooling operation, whose main job is to downsample the feature maps. Max pooling typically follows each convolutional layer and reduces the spatial dimensions of its input. It is not a required component, but it is an effective way to make the model more efficient and improve its predictive power.
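As a minimal illustration (not from the original article), 2x2 max pooling keeps only the largest value in each non-overlapping 2x2 window, halving each spatial dimension:

import numpy as np

# 2x2 max pooling by hand: a 4x4 feature map becomes 2x2.
fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 1],
                 [3, 4, 6, 8]])
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # [[6 4]
                #  [7 9]]

The following summary shows the model created so far; note how each MaxPooling2D layer halves the spatial dimensions of its input.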
Output
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 148, 148, 32)      896
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 32)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 72, 72, 64)        18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 36, 36, 64)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 34, 34, 128)       73856
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 17, 17, 128)       0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 15, 15, 128)       147584
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 7, 7, 128)         0
_________________________________________________________________
flatten (Flatten)            (None, 6272)              0
_________________________________________________________________
dropout (Dropout)            (None, 6272)              0
_________________________________________________________________
dense (Dense)                (None, 512)               3211776
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 513
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
Compiling, Fitting, Plotting, and Saving the Model
Once the model has been created, we can compile it and fit the data. The metrics produced at each epoch are stored in the history object, which is later used to plot accuracy against epochs. This plot is used to judge the model's performance and to make sure it is not overfitting.
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['acc'])
history = model.fit_generator(train_generator, steps_per_epoch=100, epochs=80,
                              validation_data=validation_generator,
                              validation_steps=50)
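Note that fit_generator is deprecated in TensorFlow 2 and removed in later releases; on newer versions, model.fit accepts generators directly, so the call above can be written as the equivalent:

# Drop-in replacement for fit_generator on modern TensorFlow.
history = model.fit(train_generator, steps_per_epoch=100, epochs=80,
                    validation_data=validation_generator,
                    validation_steps=50)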
Output
Epoch 1/80
50/50 [==============================] - 25s 503ms/step - loss: 0.6924 - acc: 0.5100
101/101 [==============================] - 156s 2s/step - loss: 0.7046 - acc: 0.5020 - val_loss: 0.6924 - val_acc: 0.5100
Epoch 2/80
50/50 [==============================] - 21s 414ms/step - loss: 0.6750 - acc: 0.5500
101/101 [==============================] - 147s 1s/step - loss: 0.6915 - acc: 0.5325 - val_loss: 0.6750 - val_acc: 0.5500
Epoch 3/80
50/50 [==============================] - 21s 413ms/step - loss: 0.6895 - acc: 0.5000
101/101 [==============================] - 145s 1s/step - loss: 0.6792 - acc: 0.5500 - val_loss: 0.6895 - val_acc: 0.5000
Epoch 4/80
50/50 [==============================] - 22s 444ms/step - loss: 0.6767 - acc: 0.5940
101/101 [==============================] - 155s 2s/step - loss: 0.6896 - acc: 0.5260 - val_loss: 0.6767 - val_acc: 0.5940
Epoch 5/80
50/50 [==============================] - 19s 388ms/step - loss: 0.6656 - acc: 0.6200
101/101 [==============================] - 154s 2s/step - loss: 0.6709 - acc: 0.5759 - val_loss: 0.6656 - val_acc: 0.6200
The output above is not complete; it shows the trace for only five epochs. The full training run takes 80 epochs and may take a while to finish, depending on the speed of the machine.
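Because a run this long can be interrupted, it may also be worth saving the best weights as training progresses. A minimal sketch using standard Keras callbacks (not part of the original script; the filename best_model.h5 is arbitrary):

from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

callbacks = [
    # Keep only the weights with the best validation accuracy seen so far.
    ModelCheckpoint('best_model.h5', monitor='val_acc', save_best_only=True),
    # Stop early if validation accuracy has not improved for 10 epochs.
    EarlyStopping(monitor='val_acc', patience=10, restore_best_weights=True),
]
history = model.fit(train_generator, steps_per_epoch=100, epochs=80,
                    validation_data=validation_generator,
                    validation_steps=50, callbacks=callbacks)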
The following code plots training and validation accuracy against epochs.
import matplotlib.pyplot as plt

acc_train = history.history['acc']
acc_val = history.history['val_acc']
epochs = range(1, 81)
plt.plot(epochs, acc_train, 'g', label='training accuracy')
plt.plot(epochs, acc_val, 'b', label='validation accuracy')
plt.title('Training and Validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
Since refitting the model from scratch every time is impractical, it is best to save the model once a well-optimized one has been obtained, saving the trouble of training it again. This can be done with the following code.
model.save('cats_and_dogs_small_1.h5')
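Because model.save writes the full model (architecture, weights, and optimizer state) to the HDF5 file, it can later be restored in a single call, without rebuilding the network by hand:

from tensorflow import keras

# Restores architecture, weights, and compile state from the .h5 file.
model = keras.models.load_model('cats_and_dogs_small_1.h5')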
For reference, the complete code is given below.
from tensorflow import keras
from keras_preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.15,
    height_shift_range=0.15,
    shear_range=0.15,
    zoom_range=0.15,
    horizontal_flip=True,
    fill_mode='nearest')
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    "D:\\dogs-vs-cats_train", target_size=(150,150),
    batch_size=20, class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    "D:\\dogs-vs-cats_validation", target_size=(150,150),
    batch_size=20, class_mode='binary')

model = keras.Sequential([
    keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(150,150,3)),
    keras.layers.MaxPool2D((2,2)),
    keras.layers.Conv2D(64, (3,3), activation='relu'),
    keras.layers.MaxPool2D((2,2)),
    keras.layers.Conv2D(128, (3,3), activation='relu'),
    keras.layers.MaxPool2D((2,2)),
    keras.layers.Conv2D(128, (3,3), activation='relu'),
    keras.layers.MaxPool2D((2,2)),
    keras.layers.Flatten(),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(512, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')])
model.summary()

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['acc'])
history = model.fit_generator(train_generator, steps_per_epoch=100, epochs=80,
                              validation_data=validation_generator,
                              validation_steps=50)
model.save('cats_and_dogs_small_1.h5')

acc_train = history.history['acc']
acc_val = history.history['val_acc']
epochs = range(1, 81)
plt.plot(epochs, acc_train, 'g', label='training accuracy')
plt.plot(epochs, acc_val, 'b', label='validation accuracy')
plt.title('Training and Validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
Loading the Model and Making Predictions
After training, the program loads the previously computed weights and uses them to predict whether each image in a new data set shows a cat or a dog.
from tensorflow import keras
from keras_preprocessing import image
from os import listdir
from os.path import isfile, join

mypath = 'D:\\ml\\test'

# Rebuild the same architecture so the saved weights can be loaded into it.
model = keras.Sequential([
    keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(150,150,3)),
    keras.layers.MaxPool2D((2,2)),
    keras.layers.Conv2D(64, (3,3), activation='relu'),
    keras.layers.MaxPool2D((2,2)),
    keras.layers.Conv2D(128, (3,3), activation='relu'),
    keras.layers.MaxPool2D((2,2)),
    keras.layers.Conv2D(128, (3,3), activation='relu'),
    keras.layers.MaxPool2D((2,2)),
    keras.layers.Flatten(),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(512, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')])
model.load_weights("D:\\ml\\cats_and_dogs_small_1.h5")

# Classify every file in the test directory.
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]
for image_file in onlyfiles:
    img = image.load_img(join(mypath, image_file), target_size=(150,150))
    x = image.img_to_array(img) / 255.0   # rescale exactly as during training
    x = x.reshape((1,) + x.shape)         # add the batch dimension
    pred = model.predict(x)               # sigmoid output in [0, 1]
    print(pred)
    if pred < 0.3:
        print(image_file + ": Must be a cat")
    elif pred > 0.7:
        print(image_file + ": Must be a dog")
    else:
        print(image_file + ": Not sure if it's a cat or a dog")
Output
[[1.]]
100.jpg: Must be a dog
[[1.]]
101.jpg: Must be a dog
[[1.]]
102.jpg: Must be a dog
[[0.]]
106.jpg: Must be a cat
[[1.]]
107.jpg: Must be a dog
Conclusion
Image classification is a flagship example of what deep learning can do. A few years back, anything like this was inconceivable even in the realm of machine learning, and deep learning is now making big strides on problems previously considered unfathomable. It should also be noted that a neural network is a black-box approach, and practicing it is more of an art than a science: a well-optimized model is the product of trial and error and of informed guesses about the hyperparameters and the number of epochs to run. Setting up the training data properly is also one of the more fundamental factors in the overall success of the model.
Appendix
I have compiled a few examples of the data set, which can be found on my GitHub.