Convolutional Neural Networks (CNN) in TensorFlow
This article assumes that the reader is familiar with the concepts of neural networks and convolutional neural networks. If you are unsure about either topic, refer to introductory material on neural networks and convolutional neural networks.
Components of a CNN:
A convolutional neural network is mainly composed of three types of layers:
- Convolutional layer: This is the main building block of a CNN. It takes a feature map or input image of a certain height, width and number of channels and transforms it into a new feature map by applying the convolution operation. The height, width and number of channels of the transformed feature map depend on the filter_size, padding and stride (a short shape-tracing example follows the layer signature below).
Python3
import tensorflow as tf
conv_layer = tf.keras.layers.Conv2D(
filters, kernel_size, strides=(1, 1), padding='valid',
data_format=None, dilation_rate=(1, 1), groups=1, activation=None,
use_bias=True, kernel_initializer='glorot_uniform',
bias_initializer='zeros', kernel_regularizer=None,
bias_regularizer=None, activity_regularizer=None, kernel_constraint=None,
bias_constraint=None, **kwargs
)
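As a hedged illustration of how the layer transforms a feature map (the filter count and kernel size below are example assumptions, not values from the article), the sketch applies a Conv2D layer to a dummy single-channel 28 × 28 input and prints the resulting shape.
Python3
import tensorflow as tf

# A dummy batch of one 28x28 single-channel image: (batch, height, width, channels)
x = tf.random.normal((1, 28, 28, 1))

# Example Conv2D layer; 8 filters and a 3x3 kernel are illustrative choices
conv = tf.keras.layers.Conv2D(filters=8, kernel_size=(3, 3),
                              strides=(1, 1), padding='valid')

y = conv(x)
print(y.shape)  # (1, 26, 26, 8): 'valid' padding shrinks 28 -> 26, channels = number of filters
With padding='same' the spatial size would stay 28 × 28 instead of shrinking.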
- Pooling layer: The pooling layer performs a downsampling operation that reduces the height and width of the input feature map. The two main types of pooling are max pooling and average pooling (a short shape example follows the signatures below).
Python3
import tensorflow as tf
max_pooling_layer = tf.keras.layers.MaxPool2D(
pool_size=(2, 2), strides=None, padding='valid', data_format=None,
**kwargs
)
avg_pooling_layer = tf.keras.layers.AveragePooling2D(
pool_size=(2, 2), strides=None, padding='valid', data_format=None,
**kwargs
)
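As a small illustrative sketch (the input shape is an example assumption), the snippet below shows max pooling with a 2 × 2 window halving the height and width while leaving the channel count unchanged.
Python3
import tensorflow as tf

# Dummy feature map: (batch, height, width, channels)
x = tf.random.normal((1, 26, 26, 8))

# 2x2 max pooling with the default stride (equal to pool_size) halves height and width
pooled = tf.keras.layers.MaxPool2D(pool_size=(2, 2))(x)
print(pooled.shape)  # (1, 13, 13, 8)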
Note: This article focuses on image data. The 2D suffix in each layer name corresponds to two-dimensional data (images). If you are working with data of a different dimensionality, refer to the TensorFlow API.
- Fully connected layer: Each node in a fully connected layer is directly connected to every node in the previous layer (a short usage example follows the signature below).
Python3
import tensorflow as tf
fully_connected_layer = tf.keras.layers.Dense(
units, activation=None, use_bias=True,
kernel_initializer='glorot_uniform',
bias_initializer='zeros', kernel_regularizer=None,
bias_regularizer=None, activity_regularizer=None, kernel_constraint=None,
bias_constraint=None, **kwargs
)
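As an illustrative sketch (the shapes are example assumptions), the snippet below flattens a feature map and feeds it to a Dense layer, which is how the convolutional part of a CNN is typically connected to its classifier head.
Python3
import tensorflow as tf

# Dummy feature map coming out of the convolution/pooling stack
x = tf.random.normal((1, 13, 13, 8))

# Dense layers expect a flat vector per sample, so flatten first: 13 * 13 * 8 = 1352 features
flat = tf.keras.layers.Flatten()(x)

# A 10-unit softmax output, e.g. one probability per digit class
out = tf.keras.layers.Dense(units=10, activation='softmax')(flat)
print(out.shape)  # (1, 10)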
Now that we have seen the basic building blocks of a CNN, let us look at the implementation of a CNN model in TensorFlow.
Implementation of LeNet-5:
LeNet-5 is a basic convolutional neural network that was created for the task of recognising handwritten digits in postal zip codes.
- Loading the MNIST dataset: The MNIST dataset consists of 60,000 training images and 10,000 test images, each of shape 28 × 28.
Python3
from tensorflow.keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
print(f'X_train shape: {X_train.shape}, X_test shape: {X_test.shape}')
# Plotting random images from the dataset (optional)
# import matplotlib.pyplot as plt
# import random
# plt.figure(figsize=(12, 5))
# for i in range(8):
#     ind = random.randint(0, len(X_train) - 1)
#     plt.subplot(240 + 1 + i)
#     plt.imshow(X_train[ind], cmap='gray')
# plt.show()
- Preprocessing the data: This includes normalising the pixel values from the range 0-255 to the range 0-1. The image labels range from 0 to 9 and are converted into one-hot vectors (a quick shape check follows the code below).
Python3
from tensorflow.keras.utils import to_categorical
# convert image datatype from integers to floats
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
# normalising pixel values
X_train = X_train/255.0
X_test = X_test/255.0
# reshape images to add channel dimension
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], X_train.shape[2], 1)
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], X_test.shape[2], 1)
# One-hot encoding label
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
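A quick sanity check (optional, not part of the original steps) confirms that the channel dimension was added and the labels were one-hot encoded:
Python3
# After preprocessing: images have an explicit channel dimension and labels are one-hot vectors
print(X_train.shape, y_train.shape)   # expected: (60000, 28, 28, 1) (60000, 10)
print(y_train[0])                     # a length-10 one-hot vector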
- Building the LeNet-5 model:
Python3
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten
model = Sequential()
# Layer 1
# Conv 1
model.add(Conv2D(filters=6, kernel_size=(5, 5), strides=1, activation='relu', input_shape=(28, 28, 1)))  # MNIST images are 28x28x1
# Pooling 1
model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
# Layer 2
# Conv 2
model.add(Conv2D(filters=16, kernel_size=(5, 5), strides=1, activation='relu'))
# Pooling 2
model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
# Flatten
model.add(Flatten())
# Layer 3
# Fully connected layer 1
model.add(Dense(units=120, activation='relu'))
# Layer 4
# Fully connected layer 2
model.add(Dense(units=84, activation='relu'))
# Layer 5
# Output layer
model.add(Dense(units=10, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
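Optionally, printing the model summary is a quick way to verify that the layer output shapes and parameter counts match the intended LeNet-5 layout:
Python3
# Print layer output shapes and parameter counts as a sanity check
model.summary()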
- Training the CNN model:
Python3
import matplotlib.pyplot as plt

epochs = 50
batch_size = 512

# With NumPy arrays, batch_size alone determines the number of steps per epoch,
# so steps_per_epoch / validation_steps are not needed here.
history = model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size,
                    validation_data=(X_test, y_test), verbose=1)

_, acc = model.evaluate(X_test, y_test, verbose=1)
print('Test accuracy: %.3f' % (acc * 100.0))

# Plot the training and validation accuracy curves
plt.figure(figsize=(10, 6))
plt.plot(history.history['accuracy'], color='blue', label='train')
plt.plot(history.history['val_accuracy'], color='red', label='val')
plt.legend()
plt.title('Accuracy')
plt.show()
Result: The LeNet-5 model achieves an accuracy of about 98.97% on the MNIST dataset.
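Beyond the aggregate accuracy, the trained model can also be used to predict individual digits; the short sketch below (an optional follow-up, not part of the original steps) compares the predicted class with the true label for the first few test images.
Python3
import numpy as np

# Predict class probabilities for the first five test images
probs = model.predict(X_test[:5])

# argmax over the softmax output gives the predicted digit;
# the true labels are one-hot encoded, so argmax recovers them as well
predicted = np.argmax(probs, axis=1)
true = np.argmax(y_test[:5], axis=1)
print('Predicted:', predicted)
print('True:     ', true)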
The task above was image classification, but with TensorFlow we can also perform tasks such as object detection, image localization, landmark detection and style transfer.