
Inception V2 and V3 – Versions of the Inception Network

Inception V1 (or GoogLeNet) was the state-of-the-art architecture at ILSVRC 2014. It produced the record-lowest error on the ImageNet classification dataset, but there were aspects of the model that could be improved to increase accuracy and reduce complexity.

Problems with the Inception V1 architecture:

Inception V1 sometimes uses convolutions such as 5×5 that shrink the input dimensions by a large margin, which can cost the network some accuracy: if the input dimensions drop too drastically, the neural network is susceptible to information loss.

Larger convolutions such as 5×5 are also considerably more expensive to compute than 3×3. We can go further in terms of factorization: a 3×3 convolution can be divided into an asymmetric 1×3 convolution followed by a 3×1 convolution. This is equivalent to sliding a two-layer network with the same receptive field as a 3×3 convolution, but is 33% cheaper. This factorization does not work well for early layers, where input dimensions are large; it only works when the input size is m×m with m between 12 and 20. Finally, according to the Inception V1 paper, the auxiliary classifiers improve the convergence of the network: by pushing useful gradients to earlier layers (to reduce the loss), they should help mitigate the vanishing-gradient problem in deep networks. However, the authors of this paper found that the auxiliary classifiers did not improve convergence much early in training.
Architectural Changes in Inception V2:
In the Inception V2 architecture, the 5×5 convolution is replaced by two 3×3 convolutions. This reduces computational cost and therefore computation time, since a 5×5 convolution is about 2.78 times more expensive than a single 3×3 convolution. Using two 3×3 layers instead of one 5×5 thus improves the performance of the architecture (the sketch after Figure 1 makes the cost comparison concrete).

Figure 1
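To make the saving concrete, below is a minimal Keras sketch (the layer sizes are illustrative, not taken from the paper) comparing the weight count of a single 5×5 convolution with two stacked 3×3 convolutions.

python3

import tensorflow as tf

inputs = tf.keras.Input(shape=(32, 32, 64))

# One 5x5 convolution: 5*5*64*64 = 102,400 weights (ignoring biases)
conv5x5 = tf.keras.layers.Conv2D(64, (5, 5), padding='same')(inputs)

# Two stacked 3x3 convolutions cover the same 5x5 receptive field with
# 2 * (3*3*64*64) = 73,728 weights; per layer, a 5x5 costs 25/9 = 2.78
# times as much as a 3x3
conv3x3 = tf.keras.layers.Conv2D(64, (3, 3), padding='same')(inputs)
conv3x3 = tf.keras.layers.Conv2D(64, (3, 3), padding='same')(conv3x3)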

The architecture also factorizes n×n convolutions into a 1×n followed by an n×1 convolution. As discussed above, a 3×3 convolution can be converted into a 1×3 convolution followed by a 3×1 convolution, which is about 33% cheaper in terms of computational complexity than the full 3×3.

Figure 2
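The asymmetric factorization can be sketched in the same way; the channel counts below are hypothetical and only serve to make the 33% figure concrete.

python3

import tensorflow as tf

inputs = tf.keras.Input(shape=(17, 17, 128))

# 1x3 followed by 3x1: (1*3 + 3*1) * 128 * 128 = 98,304 weights versus
# 3*3*128*128 = 147,456 for a full 3x3 -- about 33% cheaper for the
# same receptive field
x = tf.keras.layers.Conv2D(128, (1, 3), padding='same')(inputs)
x = tf.keras.layers.Conv2D(128, (3, 1), padding='same')(x)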

To address the problem of the representational bottleneck, the filter banks of the module were expanded (made wider) instead of deeper. Making the module deeper would instead have shrunk the dimensions excessively and caused a loss of information (a rough sketch follows Figure 3).

Figure 3
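A rough sketch of the widening idea is shown below; the exact branches and channel counts of the paper's Figure 3 module differ, so treat this only as an illustration of expanding the filter bank by concatenating parallel branches rather than stacking them.

python3

import tensorflow as tf

def widened_block(inputs, filters):
    # Project with a 1x1 convolution, then split into parallel 1x3 and
    # 3x1 branches whose outputs are concatenated: the module grows
    # wider, not deeper
    x = tf.keras.layers.Conv2D(filters, (1, 1), padding='same')(inputs)
    a = tf.keras.layers.Conv2D(filters, (1, 3), padding='same')(x)
    b = tf.keras.layers.Conv2D(filters, (3, 1), padding='same')(x)
    return tf.keras.layers.Concatenate()([a, b])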

Architectural Changes in Inception V3:
Inception V3 is similar to Inception V2 and includes all of its features, with the following changes/additions:

  • Use of the RMSProp optimizer.
  • Batch normalization in the fully connected layer of the auxiliary classifier.
  • Use of factorized 7×7 convolutions.
  • Label smoothing regularization: a method of regularizing the classifier by estimating the effect of label dropout during training. It prevents the classifier from predicting a class too confidently. Adding label smoothing improved the error rate by 0.2% (see the sketch after this list).
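Label smoothing is available directly in the Keras losses API; a minimal sketch follows (the smoothing factor 0.1 matches the paper, while the surrounding usage is assumed).

python3

import tensorflow as tf

# With smoothing 0.1 and K classes, a one-hot target [0, 1, 0, ...]
# becomes [0.1/K, 0.9 + 0.1/K, 0.1/K, ...], so the network is penalized
# for pushing all probability mass onto a single class
loss_fn = tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1)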

Architecture:
Below are the layer-by-layer details of Inception V2:



Inception V2 architecture

The above architecture takes an image input of size (299, 299, 3). Note that Figures 5, 6, and 7 in the architecture above refer to Figures 1, 2, and 3 of this article.
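As a quick sanity check (standard Keras Applications usage, not part of the original article), loading the stock model confirms the expected input size.

python3

from tensorflow.keras.applications.inception_v3 import InceptionV3

# With include_top=True (the default), the pre-trained InceptionV3
# expects 299x299 RGB inputs
model = InceptionV3(weights='imagenet')
print(model.input_shape)   # (None, 299, 299, 3)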
Implementation:
In this section we will look at an implementation of Inception V3. We will use the Keras Applications API to load the model, and the Cats vs Dogs dataset for this implementation.
Code: Import the required modules.

python3
import os
import zipfile
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers
from tensorflow.keras import Model
from tensorflow.keras.applications.inception_v3 import InceptionV3



Code: Create the directories to prepare the dataset.

python3

local_zip = '/dataset/cats_and_dogs_filtered.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall('/tmp')
zip_ref.close()
 
base_dataset_dir = '/tmp/cats_and_dogs_filtered'
train_dir = os.path.join(base_dataset_dir, 'train')
validation_dir = os.path.join(base_dataset_dir, 'validation')
 
# Directory with our training cat pictures
train_cats = os.path.join(train_dir, 'cats')
 
# Directory with our training dog pictures
train_dogs = os.path.join(train_dir, 'dogs')
 
# Directory with our validation cat pictures
validation_cats = os.path.join(validation_dir, 'cats')
 
# Directory with our validation dog pictures
validation_dogs = os.path.join(validation_dir, 'dogs')

Code: Store the dataset in the directories created above and plot a few sample images.

python3

# Set up a matplotlib figure and size it to fit a 4x4 grid of pictures
import matplotlib.image as mpimg

nrows = 4
ncols = 4

fig = plt.gcf()
fig.set_size_inches(ncols * 4, nrows * 4)

pic_index = 100
train_cat_files = os.listdir(train_cats)
train_dog_files = os.listdir(train_dogs)

# Paths of the next 8 cat and 8 dog images to display
next_cat_img = [os.path.join(train_cats, fname)
                for fname in train_cat_files[pic_index - 8:pic_index]]

next_dog_img = [os.path.join(train_dogs, fname)
                for fname in train_dog_files[pic_index - 8:pic_index]]

for i, img_path in enumerate(next_cat_img + next_dog_img):
  # Set up subplot; subplot indices start at 1
  sp = plt.subplot(nrows, ncols, i + 1)
  sp.axis('off')  # Don't show axes (or gridlines)

  img = mpimg.imread(img_path)
  plt.imshow(img)

plt.show()

Code: Apply data augmentation to increase the number of samples in the dataset.

python3

train_datagen = ImageDataGenerator(rescale=1.0 / 255.0,
                                   rotation_range=50,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1.0 / 255.0)

train_generator = train_datagen.flow_from_directory(train_dir,
                                                    batch_size=20,
                                                    class_mode='binary',
                                                    target_size=(150, 150))

validation_generator = test_datagen.flow_from_directory(validation_dir,
                                                        batch_size=20,
                                                        class_mode='binary',
                                                        target_size=(150, 150))

Code: Define the base model using the Inception API imported above, along with a callback function used during training.

python3

# Load InceptionV3 pre-trained on ImageNet, without its top classifier
base_model = InceptionV3(input_shape=(150, 150, 3),
                         include_top=False,
                         weights='imagenet')

# Freeze the pre-trained layers so that only the new head is trained
for layer in base_model.layers:
  layer.trainable = False

# Stop training once model accuracy reaches 99%
class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    logs = logs or {}
    if logs.get('acc', 0) > 0.99:
      self.model.stop_training = True

In this step we train the model. Before training, we need to change the last layers so that the network predicts a single output, and pick an optimizer for training: here we use RMSprop with a learning rate of 0.0001. We also add a dropout of 0.2 after the last fully connected layer. After that, we train the model for up to 100 epochs.
Code:

python3

# Add a new classification head on top of the frozen base model
x = layers.Flatten()(base_model.output)
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(1, activation='sigmoid')(x)

model = Model(base_model.input, x)

model.compile(optimizer=RMSprop(learning_rate=0.0001),
              loss='binary_crossentropy',
              metrics=['acc'])
callbacks = myCallback()

history = model.fit(train_generator,
                    validation_data=validation_generator,
                    steps_per_epoch=100,
                    epochs=100,
                    validation_steps=50,
                    verbose=2,
                    callbacks=[callbacks])

Code: Plot the training and validation accuracy and the training and validation loss.

python3

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'r', label='Training accuracy')
plt.plot(epochs, val_acc, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'r', label='Training Loss')
plt.plot(epochs, val_loss, 'b', label='Validation Loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

Results:
The best-performing Inception V3 architecture reported a top-5 error rate of just 5.6% and a top-1 error rate of 21.2% for a single crop on the ILSVRC 2012 classification challenge, a new state of the art at the time. Over multiple crops (144 crops), it reported top-5 and top-1 error rates of 4.2% and 18.77% respectively on the ILSVRC 2012 classification benchmark.



Inception V3 performance

An ensemble of Inception V3 architectures reported a top-5 error rate of 3.46% on the ILSVRC 2012 validation set (3.58% on the ILSVRC 2012 test set).

Ensemble results for Inception V3

References

  • Szegedy et al., "Rethinking the Inception Architecture for Computer Vision" (the Inception V2/V3 paper)