📜  Convolutional Neural Network (CNN) in TensorFlow

📅  Last modified: 2022-05-13 01:54:53.752000             🧑  Author: Mango


This article assumes the reader is familiar with the concepts of neural networks and convolutional neural networks. If you are not sure of these topics, refer to Neural Networks and Convolutional Neural Networks first.

Components of a CNN:

A convolutional neural network is mainly composed of three types of layers:

  • Convolutional layer: This is the main building block of a CNN. It takes a feature map or input image with a certain height, width, and number of channels, and transforms it into a new feature map by applying a convolution operation. The height, width, and number of channels of the output feature map depend on the kernel size, padding, and stride.
Python3
import tensorflow as tf
  
conv_layer = tf.keras.layers.Conv2D(
    filters, kernel_size, strides=(1, 1), padding='valid',
    data_format=None, dilation_rate=(1, 1), groups=1, activation=None,
    use_bias=True, kernel_initializer='glorot_uniform',
    bias_initializer='zeros', kernel_regularizer=None,
    bias_regularizer=None, activity_regularizer=None, kernel_constraint=None,
    bias_constraint=None, **kwargs
)
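
As a quick sanity check of how the output shape depends on the kernel size, padding, and stride, the layer can be applied to a dummy input. This is a minimal sketch; the input shape and filter settings below are arbitrary example values, not taken from the article.

```python
import tensorflow as tf

# Toy input: batch of 1 grayscale image of size 28 x 28 (arbitrary example)
x = tf.random.normal((1, 28, 28, 1))

# 6 filters of size 5x5, stride 1, 'valid' padding:
# output height/width = (28 - 5) / 1 + 1 = 24
conv = tf.keras.layers.Conv2D(filters=6, kernel_size=(5, 5),
                              strides=1, padding='valid')
y = conv(x)
print(y.shape)  # (1, 24, 24, 6)
```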



  • Pooling layer: The pooling layer performs a downsampling operation that reduces the size of the input feature map. The two main types of pooling are max pooling and average pooling.

Python3

import tensorflow as tf
  
max_pooling_layer = tf.keras.layers.MaxPool2D(
    pool_size=(2, 2), strides=None, padding='valid', data_format=None,
    **kwargs
)
  
avg_pooling_layer = tf.keras.layers.AveragePooling2D(
    pool_size=(2, 2), strides=None, padding='valid', data_format=None,
    **kwargs
)
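
For example, with strides left at the default (equal to pool_size), a 2 x 2 pool halves the spatial dimensions while leaving the channel dimension unchanged. A minimal sketch with arbitrary example shapes:

```python
import tensorflow as tf

# Toy feature map: 24 x 24 with 6 channels (arbitrary example)
x = tf.random.normal((1, 24, 24, 6))

# strides defaults to pool_size, so height and width are halved
y_max = tf.keras.layers.MaxPool2D(pool_size=(2, 2))(x)
y_avg = tf.keras.layers.AveragePooling2D(pool_size=(2, 2))(x)
print(y_max.shape, y_avg.shape)  # (1, 12, 12, 6) (1, 12, 12, 6)
```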

Note: This article focuses on image data. The suffix 2D in each layer name corresponds to two-dimensional data (images). If you are working with data of a different dimensionality, refer to the TensorFlow API.
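
To illustrate this, the 1D and 3D counterparts follow the same pattern. A minimal sketch; the shapes below are arbitrary examples:

```python
import tensorflow as tf

# 1D: sequence data, shape (batch, steps, channels)
seq = tf.random.normal((1, 100, 8))
y1 = tf.keras.layers.Conv1D(filters=4, kernel_size=3)(seq)
print(y1.shape)  # (1, 98, 4)

# 3D: volumetric data, shape (batch, depth, height, width, channels)
vol = tf.random.normal((1, 16, 16, 16, 1))
y3 = tf.keras.layers.Conv3D(filters=2, kernel_size=3)(vol)
print(y3.shape)  # (1, 14, 14, 14, 2)
```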

  • Fully connected layer: Every node in this layer is connected directly to every node in the previous layer.

Python3

import tensorflow as tf
  
fully_connected_layer = tf.keras.layers.Dense(
    units, activation=None, use_bias=True,
    kernel_initializer='glorot_uniform',
    bias_initializer='zeros', kernel_regularizer=None,
    bias_regularizer=None, activity_regularizer=None, kernel_constraint=None,
    bias_constraint=None, **kwargs
)
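
Applied to a flattened feature vector, the layer maps its input to `units` outputs. A minimal sketch; the input length of 256 is an arbitrary example value:

```python
import tensorflow as tf

# Toy flattened feature vector of length 256 (arbitrary example)
x = tf.random.normal((1, 256))
dense = tf.keras.layers.Dense(units=120, activation='relu')
y = dense(x)
print(y.shape)  # (1, 120)
# Parameter count: 256 * 120 weights + 120 biases = 30,840
```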

Now that we have seen the basic building blocks of a CNN, let us look at the implementation of a CNN model in TensorFlow.

Implementation of LeNet-5:

LeNet-5 is a basic convolutional neural network originally created for the task of recognising handwritten digits in postal zip codes.

  • Loading the MNIST dataset: The MNIST dataset consists of 60,000 training images and 10,000 test images, each of shape 28 × 28.

Python3

from tensorflow.keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
print(f'X_train shape: {X_train.shape}, X_test shape: {X_test.shape}')
  
# Plotting random images from the dataset
  
# import matplotlib.pyplot as plt
# import random
# plt.figure(figsize=(12, 5))
# for i in range(8):
#     ind = random.randint(0, len(X_train) - 1)
#     plt.subplot(240 + 1 + i)
#     plt.imshow(X_train[ind], cmap='gray')
# plt.show()

Random images from the dataset

  • Preprocessing the data: This includes normalising the pixel values from the range 0-255 to the range 0-1. The image labels (digits 0-9) are converted to one-hot vectors.

Python3

from tensorflow.keras.utils import to_categorical
    
# convert image datatype from integers to floats
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
    
# normalising pixel values
X_train = X_train/255.0
X_test = X_test/255.0
    
# reshape images to add channel dimension
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], X_train.shape[2], 1)
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], X_test.shape[2], 1)
    
# One-hot encoding label 
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

  • Building the LeNet-5 model:

Python3

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten
  
model = Sequential()
  
# Layer 1
# Conv 1
model.add(Conv2D(filters=6, kernel_size=(5, 5), strides=1, activation='relu', input_shape=(28, 28, 1)))
# Pooling 1
model.add(MaxPooling2D(pool_size=(2, 2), strides = 2))
  
# Layer 2
# Conv 2
model.add(Conv2D(filters=16, kernel_size=(5, 5), strides=1, activation='relu'))
# Pooling 2
model.add(MaxPooling2D(pool_size = 2, strides = 2))
  
# Flatten
model.add(Flatten())
  
# Layer 3
# Fully connected layer 1
model.add(Dense(units=120, activation='relu'))
  
# Layer 4
# Fully connected layer 2
model.add(Dense(units=84, activation='relu'))
  
# Layer 5
# Output layer
model.add(Dense(units=10, activation='softmax'))
  
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

Model summary

  • Training the CNN model:

Python3

import matplotlib.pyplot as plt
  
epochs = 50
batch_size = 512
# With NumPy array inputs, batch_size alone determines the number of
# steps per epoch, so steps_per_epoch/validation_steps are not needed.
history = model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size,
                    validation_data=(X_test, y_test), verbose=1)
  
_, acc = model.evaluate(X_test, y_test, verbose=1)
print('%.3f' % (acc * 100.0))
  
# Plotting train vs validation accuracy
plt.figure(figsize=(10, 6))
plt.plot(history.history['accuracy'], color='blue', label='train')
plt.plot(history.history['val_accuracy'], color='red', label='val')
plt.legend()
plt.title('Accuracy')
plt.show()

Training history

Epochs vs. accuracy

Result: The LeNet-5 model gives 98.97% accuracy on the MNIST dataset.

The task above was image classification, but with TensorFlow we can also perform tasks such as object detection, image localization, landmark detection, and style transfer.