📜  tensorflow conv2d (1)

📅  Last modified: 2023-12-03 15:05:32.307000             🧑  Author: Mango

TensorFlow Conv2D

Convolutional Neural Networks (ConvNets or CNNs) are a category of Neural Networks that have proven very effective in areas such as image recognition and classification. One of the building blocks of a ConvNet is the Convolutional layer, which applies a set of filters to the input image, producing a set of feature maps.

In TensorFlow, the tf.nn.conv2d() function provides a simple and efficient way to perform 2-dimensional convolution on an input tensor. The function takes the following arguments:

tf.nn.conv2d(input, filters, strides, padding, data_format=None, dilations=None, name=None)
  • input: A 4-D input tensor, with shape [batch, in_height, in_width, in_channels] in the default "NHWC" layout.
  • filters: A tensor with shape [filter_height, filter_width, in_channels, out_channels]
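  • strides: An int or a list of 1, 2, or 4 ints giving the stride of the sliding window for each dimension of input. A single value applies to both height and width. With 4 values, the order follows data_format ([batch, height, width, channels] for "NHWC"), and the batch and channel strides are expected to be 1. For example, strides=[1, 1, 1, 1] moves the filter 1 pixel at a time along both height and width.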
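  • padding: Either the string "VALID" or "SAME", or a list giving explicit padding amounts at the start and end of each dimension. "VALID" means no padding is applied, so the output shrinks. "SAME" means the input is padded with zeros so that the output height and width equal the input size divided by the stride, rounded up; with a stride of 1 the output has the same height and width as the input.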
  • data_format: The ordering of the dimensions in the input and output, either "NHWC" (the default) or "NCHW". "NHWC" means the dimensions are in the order [batch, height, width, channels]. "NCHW" means the dimensions are in the order [batch, channels, height, width].
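  • dilations: An int or a list of 1, 2, or 4 ints giving the dilation rate for each dimension of input; the default is 1 (no dilation). A single value applies to both height and width. With 4 values, the order follows data_format ([batch, height, width, channels] for "NHWC"), and the batch and channel dilations must be 1.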
  • name: A name for the operation.
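
To make the input and filters layouts concrete, here is a naive NumPy reference for the "VALID" case without dilation. It is only a sketch for illustration, not how TensorFlow implements the operation; the function name conv2d_valid_reference and the toy data are made up for this example. Like tf.nn.conv2d(), it slides the filter without flipping it (strictly speaking a cross-correlation).

```python
import numpy as np


def conv2d_valid_reference(inputs, filters, stride=1):
    """Naive NHWC convolution with "VALID" padding and no dilation (illustrative only)."""
    batch, in_h, in_w, in_c = inputs.shape
    f_h, f_w, f_in_c, out_c = filters.shape
    assert in_c == f_in_c, "input and filter channel counts must match"

    # Number of positions where the filter fits entirely inside the input
    out_h = (in_h - f_h) // stride + 1
    out_w = (in_w - f_w) // stride + 1
    output = np.zeros((batch, out_h, out_w, out_c), dtype=inputs.dtype)

    for i in range(out_h):
        for j in range(out_w):
            # Patch of the input covered by the filter at output position (i, j)
            patch = inputs[:, i * stride:i * stride + f_h,
                           j * stride:j * stride + f_w, :]
            # Sum over filter height, filter width, and input channels
            output[:, i, j, :] = np.tensordot(
                patch, filters, axes=([1, 2, 3], [0, 1, 2]))
    return output


# A 4x4 single-channel image of ones and a single 3x3 filter of ones
image = np.ones((1, 4, 4, 1), dtype=np.float32)
kernel = np.ones((3, 3, 1, 1), dtype=np.float32)
result = conv2d_valid_reference(image, kernel)
print(result.shape)        # (1, 2, 2, 1)
print(result[0, :, :, 0])  # every entry is 9.0
```

Each output value is the sum of a 3x3 patch of ones, which is why every entry is 9. The last axis of filters selects the output channel, so a filter tensor with out_channels=32 would produce 32 such feature maps.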

The output of the function is a tensor of the same type as the input, with its dimensions ordered according to data_format. Its height and width depend on the padding, strides, and dilations arguments, and its channel count equals out_channels.
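
The dependence of the output size on padding, strides, and dilations can be sketched for one spatial dimension with a small helper. The function name conv2d_output_size is hypothetical (it is not part of TensorFlow); it follows the usual rules for the "SAME" and "VALID" schemes and does not cover explicit padding lists.

```python
import math


def conv2d_output_size(in_size, filter_size, stride=1, padding="SAME", dilation=1):
    """Output height or width along one spatial dimension (illustrative helper)."""
    # Dilation spreads the filter taps apart, enlarging the span the filter covers
    effective_filter = (filter_size - 1) * dilation + 1
    if padding == "SAME":
        # Zero padding is added so that only the stride shrinks the output
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # Only positions where the filter fits entirely inside the input count
        return math.ceil((in_size - effective_filter + 1) / stride)
    raise ValueError('padding must be "SAME" or "VALID"')


print(conv2d_output_size(28, 5, stride=1, padding="SAME"))   # 28
print(conv2d_output_size(28, 5, stride=1, padding="VALID"))  # 24
print(conv2d_output_size(28, 5, stride=2, padding="SAME"))   # 14
print(conv2d_output_size(28, 5, stride=2, padding="VALID"))  # 12
print(conv2d_output_size(28, 5, padding="VALID", dilation=2))  # 20
```

Applying the helper to height and width separately gives the two spatial dimensions of the output; the batch size is unchanged and the channel dimension becomes out_channels.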

Example

Suppose we have an input tensor x with shape [1, 28, 28, 3], representing a batch of 1 image with height and width of 28 pixels, and 3 color channels. We want to apply a set of 32 filters to the input, each with a size of 5x5 pixels and a stride of 1 in all dimensions. We can achieve this using the following code:

import tensorflow as tf

# Random values stand in for a real batch of one 28x28 RGB image
x = tf.random.normal([1, 28, 28, 3])
# 32 filters, each 5x5 and spanning the 3 input channels
filters = tf.Variable(tf.random.normal([5, 5, 3, 32]))
# Stride 1 with "SAME" padding preserves the 28x28 spatial size
conv = tf.nn.conv2d(x, filters, strides=[1, 1, 1, 1], padding="SAME")

print(conv.shape)  # (1, 28, 28, 32)

Note that in TensorFlow 2.x operations execute eagerly, so the convolution runs as soon as tf.nn.conv2d() is called; no placeholder or session is required. (The tf.placeholder and tf.Session APIs from TensorFlow 1.x are available only under tf.compat.v1.) Random values stand in for the input here; in practice x would hold real image data. We also initialize the filter tensor using a random normal distribution. The resulting output tensor has shape [1, 28, 28, 32], representing the 32 feature maps produced by the convolution.
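
As a variation on the example, the sketch below runs the same 5x5 convolution with both padding schemes and two stride values and records the resulting shapes. It assumes TensorFlow 2.x with eager execution; the random input and the shapes dictionary are illustrative only.

```python
import tensorflow as tf

# Same sizes as the example: one 28x28 RGB image and 32 filters of 5x5
x = tf.random.normal([1, 28, 28, 3])
filters = tf.random.normal([5, 5, 3, 32])

shapes = {}
for padding in ("SAME", "VALID"):
    for stride in (1, 2):
        # Stride only along height and width; batch and channel strides stay 1
        y = tf.nn.conv2d(x, filters, strides=[1, stride, stride, 1], padding=padding)
        shapes[(padding, stride)] = tuple(y.shape)
        print(padding, stride, tuple(y.shape))

# SAME 1 (1, 28, 28, 32)
# SAME 2 (1, 14, 14, 32)
# VALID 1 (1, 24, 24, 32)
# VALID 2 (1, 12, 12, 32)
```

Only the "SAME" run with a stride of 1 keeps the 28x28 spatial size; the channel dimension is 32 in every case because it is set by the number of filters.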