tensorflow_tf.keras.layers.Conv2D, coding the basics of a CNN


HAN_PY 2020. 10. 13. 10:09

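0. Introduction

For the basics of CNNs, follow the link below.

han-py.tistory.com/230

I lost the whole draft while writing this post, so I am patiently writing it again. If you open the relevant source file on GitHub and look around tf.keras.layers.Conv2D, you will find the definition below. Note that the excerpt is the Conv2DTranspose class, which subclasses Conv2D and shares most of its arguments. It is enough to see roughly what it looks like and move on.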

class Conv2DTranspose(Conv2D):
  """Transposed convolution layer (sometimes called Deconvolution).
  The need for transposed convolutions generally arises
  from the desire to use a transformation going in the opposite direction
  of a normal convolution, i.e., from something that has the shape of the
  output of some convolution to something that has the shape of its input
  while maintaining a connectivity pattern that is compatible with
  said convolution.
  When using this layer as the first layer in a model,
  provide the keyword argument `input_shape`
  (tuple of integers, does not include the sample axis),
  e.g. `input_shape=(128, 128, 3)` for 128x128 RGB pictures
  in `data_format="channels_last"`.
  Arguments:
    filters: Integer, the dimensionality of the output space
      (i.e. the number of output filters in the convolution).
    kernel_size: An integer or tuple/list of 2 integers, specifying the
      height and width of the 2D convolution window.
      Can be a single integer to specify the same value for
      all spatial dimensions.
    strides: An integer or tuple/list of 2 integers,
      specifying the strides of the convolution along the height and width.
      Can be a single integer to specify the same value for
      all spatial dimensions.
      Specifying any stride value != 1 is incompatible with specifying
      any `dilation_rate` value != 1.
    padding: one of `"valid"` or `"same"` (case-insensitive).
    output_padding: An integer or tuple/list of 2 integers,
      specifying the amount of padding along the height and width
      of the output tensor.
      Can be a single integer to specify the same value for all
      spatial dimensions.
      The amount of output padding along a given dimension must be
      lower than the stride along that same dimension.
      If set to `None` (default), the output shape is inferred.
    data_format: A string,
      one of `channels_last` (default) or `channels_first`.
      The ordering of the dimensions in the inputs.
      `channels_last` corresponds to inputs with shape
      `(batch_size, height, width, channels)` while `channels_first`
      corresponds to inputs with shape
      `(batch_size, channels, height, width)`.
      It defaults to the `image_data_format` value found in your
      Keras config file at `~/.keras/keras.json`.
      If you never set it, then it will be "channels_last".
    dilation_rate: an integer or tuple/list of 2 integers, specifying
      the dilation rate to use for dilated convolution.
      Can be a single integer to specify the same value for
      all spatial dimensions.
      Currently, specifying any `dilation_rate` value != 1 is
      incompatible with specifying any stride value != 1.
    activation: Activation function to use.
      If you don't specify anything, no activation is applied (
      see `keras.activations`).
    use_bias: Boolean, whether the layer uses a bias vector.
    kernel_initializer: Initializer for the `kernel` weights matrix (
      see `keras.initializers`).
    bias_initializer: Initializer for the bias vector (
      see `keras.initializers`).
    kernel_regularizer: Regularizer function applied to
      the `kernel` weights matrix (see `keras.regularizers`).
    bias_regularizer: Regularizer function applied to the bias vector (
      see `keras.regularizers`).
    activity_regularizer: Regularizer function applied to
      the output of the layer (its "activation") (see `keras.regularizers`).
    kernel_constraint: Constraint function applied to the kernel matrix (
      see `keras.constraints`).
    bias_constraint: Constraint function applied to the bias vector (
      see `keras.constraints`).
  Input shape:
    4D tensor with shape:
    `(batch_size, channels, rows, cols)` if data_format='channels_first'
    or 4D tensor with shape:
    `(batch_size, rows, cols, channels)` if data_format='channels_last'.
  Output shape:
    4D tensor with shape:
    `(batch_size, filters, new_rows, new_cols)` if data_format='channels_first'
    or 4D tensor with shape:
    `(batch_size, new_rows, new_cols, filters)` if data_format='channels_last'.
    `rows` and `cols` values might have changed due to padding.
    If `output_padding` is specified:
    ```
    new_rows = ((rows - 1) * strides[0] + kernel_size[0] - 2 * padding[0] +
    output_padding[0])
    new_cols = ((cols - 1) * strides[1] + kernel_size[1] - 2 * padding[1] +
    output_padding[1])
    ```
  Returns:
    A tensor of rank 4 representing
    `activation(conv2dtranspose(inputs, kernel) + bias)`.
  Raises:
    ValueError: if `padding` is "causal".
    ValueError: when both `strides` > 1 and `dilation_rate` > 1.
  References:
    - [A guide to convolution arithmetic for deep
      learning](https://arxiv.org/abs/1603.07285v1)
    - [Deconvolutional
      Networks](https://www.matthewzeiler.com/mattzeiler/deconvolutionalnetworks.pdf)
  """

  def __init__(self,
               filters,
               kernel_size,
               strides=(1, 1),
               padding='valid',
               output_padding=None,
               data_format=None,
               dilation_rate=(1, 1),
               activation=None,
               use_bias=True,
               kernel_initializer='glorot_uniform',
               bias_initializer='zeros',
               kernel_regularizer=None,
               bias_regularizer=None,
               activity_regularizer=None,
               kernel_constraint=None,
               bias_constraint=None,
               **kwargs):

The TensorFlow site shows the signature as follows.

tf.keras.layers.Conv2DTranspose(
    filters, kernel_size, strides=(1, 1), padding='valid', output_padding=None,
    data_format=None, dilation_rate=(1, 1), activation=None, use_bias=True,
    kernel_initializer='glorot_uniform', bias_initializer='zeros',
    kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None,
    kernel_constraint=None, bias_constraint=None, **kwargs
)

Let's go through the arguments one by one.

filters - [integer] the number of convolution filters, i.e. how many filters to use. In other words, it is the number of channels in the output feature map.
kernel_size - [integer, tuple, list] decides whether the convolution filter is 3x3, 5x5, and so on. You can write just 3, the tuple (3, 3), or the list [3, 3].
strides - [same types as kernel_size] how many cells the filter moves at each step.
padding - either valid or same.

valid applies no padding. same pads the input so that, when strides is 1, the output has the same size as the input. Because the size does not shrink, the layer can be stacked many times.

data_format - channels_last is the default, and the input must be ordered as (batch, height, width, channels). With channels_first the order must be (batch, channels, height, width). An input image or input feature map must likewise be passed as a 4-D tensor in this order.
activation - the activation function to apply.
use_bias - whether to use a bias.
kernel_initializer, bias_initializer - how to initialize the convolution filter and the bias.
kernel_regularizer, bias_regularizer - the regularization applied to the convolution filter and the bias.
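To make the difference between valid and same concrete, here is a minimal sketch of the output-size arithmetic for an ordinary (non-transposed) convolution along one spatial dimension. The function name conv_output_size is made up for illustration; it is not part of TensorFlow.

```python
import math

def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial dimension of a regular convolution.

    Hypothetical helper for illustration only, not a TensorFlow function.
    """
    padding = padding.lower()  # Keras treats the padding string case-insensitively
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return math.ceil((input_size - kernel_size + 1) / stride)
    if padding == "same":
        # Zero padding is added so that only the stride shrinks the output.
        return math.ceil(input_size / stride)
    raise ValueError("padding must be 'valid' or 'same'")

print(conv_output_size(3, 2, stride=1, padding="valid"))  # 2
print(conv_output_size(3, 2, stride=1, padding="same"))   # 3
print(conv_output_size(3, 2, stride=2, padding="same"))   # 2
```

With a 3x3 input and a 2x2 filter at stride 1, valid shrinks each side to 2 while same keeps it at 3, which matches the examples later in this post.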

When we use a high-level API such as keras.layers, the convolution filter actually stores its dimensions in the following order.

kernel dimension : {height, width, in_channel, out_channel}

height, width, and in_channel describe the shape of a single convolution filter, and out_channel is the number of convolution filters.
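One way to get a feel for this layout is to count the trainable parameters it implies. The sketch below multiplies the four kernel dimensions and adds one bias per filter. The helper name conv2d_param_count is an assumption made for this example, not a Keras function.

```python
def conv2d_param_count(height, width, in_channel, out_channel, use_bias=True):
    """Number of trainable values in a Conv2D layer (illustrative helper)."""
    # One weight for every position in the {height, width, in_channel, out_channel} kernel.
    kernel_params = height * width * in_channel * out_channel
    # One bias value per filter when use_bias=True.
    bias_params = out_channel if use_bias else 0
    return kernel_params + bias_params

print(conv2d_param_count(2, 2, 1, 1))   # 5: the single 2x2 filter used in this post
print(conv2d_param_count(2, 2, 1, 3))   # 15: three 2x2 filters
print(conv2d_param_count(3, 3, 3, 64))  # 1792: 64 filters of 3x3 over an RGB input
```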

1. Code

1.1 import

import numpy as np
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
print(tf.__version__)
print(keras.__version__)

tf.executing_eagerly()
#tf.enable_eager_execution()

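Some code calls tf.enable_eager_execution(), but from TensorFlow version 2 onward it can be omitted.

Put simply, tf.enable_eager_execution() makes operations run immediately and return their results, without first building a graph and running it in a session. It was used so routinely in version 1 that version 2 turns it on by default, and calling tf.enable_eager_execution() in version 2 raises an error. If you call tf.executing_eagerly(), it returns True, which shows that the feature is on.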
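As a quick check of what eager execution means in practice, the sketch below assumes TensorFlow 2.x is installed. The values x and y are made up for illustration.

```python
import tensorflow as tf

# In TensorFlow 2.x eager execution is on by default.
print(tf.executing_eagerly())

# Operations run immediately, so the result can be read back as a NumPy array
# without building a graph or opening a session.
x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
y = tf.matmul(x, x)
print(y.numpy())
```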

Let's create an image.

image = tf.constant([[[[1], [2], [3]], 
                      [[4], [5], [6]],
                      [[7], [8], [9]]]], dtype=np.float32)

The shape (1, 3, 3, 1) is, in order, batch, height, width, channel. The batch is 1 because there is a single image, and the channel is 1 because the image is grayscale.
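The same 4-D layout can be built and inspected with plain NumPy, which may make the axis order easier to see. This is only an illustrative alternative to the tf.constant call above.

```python
import numpy as np

# Values 1..9 arranged as (batch, height, width, channel) = (1, 3, 3, 1).
image = np.arange(1, 10, dtype=np.float32).reshape(1, 3, 3, 1)
print(image.shape)  # (1, 3, 3, 1)

# Drop the batch and channel axes to view the 3x3 grid of pixel values.
print(image[0, :, :, 0])
```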

Let's sketch the filter we are going to use.

image (1, 3, 3, 1), filter (2, 2, 1, 1), stride 1x1, padding VALID

1 2 3
4 5 6   =>   1 1   =>   12 16
7 8 9        1 1        24 28
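The arithmetic in the diagram above can be reproduced by hand. The following is a minimal NumPy sketch of a stride-1 convolution without padding. The function conv2d_valid is written for this example and is not a TensorFlow API. Like Keras, it slides the kernel without flipping it.

```python
import numpy as np

def conv2d_valid(image_2d, kernel_2d):
    """Slide kernel_2d over image_2d with stride 1 and no padding."""
    ih, iw = image_2d.shape
    kh, kw = kernel_2d.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1), dtype=image_2d.dtype)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            # Element-wise product of the current window and the kernel, then sum.
            out[r, c] = np.sum(image_2d[r:r + kh, c:c + kw] * kernel_2d)
    return out

image_2d = np.arange(1, 10, dtype=np.float32).reshape(3, 3)
kernel_2d = np.ones((2, 2), dtype=np.float32)
valid_out = conv2d_valid(image_2d, kernel_2d)
print(valid_out)
# [[12. 16.]
#  [24. 28.]]
```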

weight = np.array([[[[1.]], [[1.]]],
                    [[[1.]], [[1.]]]])

Here weight.shape is (2, 2, 1, 1).

The first value is the height of the convolution filter.

The second is the width.

The third is the channel.

The fourth is the number of convolution filters.

1.2 padding - VALID

weight_init = tf.constant_initializer(weight)
conv2d = keras.layers.Conv2D(filters=1, kernel_size=2, padding='VALID',
                             kernel_initializer=weight_init)(image)

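An initializer is a function that takes a shape and returns a tensor. Here is a brief overview.

tf.constant_initializer(value) initializes everything to the given value
tf.random_uniform_initializer(a, b) initializes uniformly from the range [a, b]
tf.random_normal_initializer(mean, stddev) initializes from a normal distribution with the given mean and standard deviation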
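The three initializers listed above can also be called directly with a shape, which shows what "takes a shape and returns a tensor" means. This sketch assumes TensorFlow 2.x. The seed values and the range and standard deviation chosen here are arbitrary picks for illustration.

```python
import numpy as np
import tensorflow as tf

shape = (2, 2, 1, 1)  # {height, width, in_channel, out_channel}
weight = np.ones(shape, dtype=np.float32)

const_init = tf.constant_initializer(weight)
uniform_init = tf.random_uniform_initializer(minval=-1.0, maxval=1.0, seed=0)
normal_init = tf.random_normal_initializer(mean=0.0, stddev=0.05, seed=0)

# Each initializer is called with the shape of the tensor it should produce.
const_kernel = const_init(shape=shape, dtype=tf.float32)
uniform_kernel = uniform_init(shape=shape, dtype=tf.float32)
normal_kernel = normal_init(shape=shape, dtype=tf.float32)

print(const_kernel.numpy().reshape(2, 2))
print(uniform_kernel.numpy().reshape(2, 2))
print(normal_kernel.numpy().reshape(2, 2))
```

When an initializer is passed as kernel_initializer, Keras makes this call itself with the kernel shape of the layer.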

1.3 padding - SAME

The example above used VALID padding. Now let's switch the padding to SAME!

Everything else is identical; only the padding argument differs.

weight_init = tf.constant_initializer(weight)
conv2d = keras.layers.Conv2D(filters=1, kernel_size=2, padding='SAME',
                             kernel_initializer=weight_init)(image)

image (1, 3, 3, 1), filter (2, 2, 1, 1), stride 1x1, padding SAME

1 2 3 0
4 5 6 0   =>   1 1   =>   12 16  9
7 8 9 0        1 1        24 28 15
0 0 0 0                   15 17  9

The 0s in the rightmost column and the bottom row are the padding that was added.
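The SAME case can be checked the same way. The sketch below zero-pads the image on the bottom and right, as in the diagram above, and then slides the 2x2 filter with stride 1. The function conv2d_same_stride1 is a hypothetical helper written for this post and only covers stride 1.

```python
import numpy as np

def conv2d_same_stride1(image_2d, kernel_2d):
    """Zero-pad so the output keeps the input size, then slide the kernel."""
    kh, kw = kernel_2d.shape
    pad_h, pad_w = kh - 1, kw - 1       # total padding needed at stride 1
    top, left = pad_h // 2, pad_w // 2  # any odd leftover goes to the bottom/right
    padded = np.pad(image_2d,
                    ((top, pad_h - top), (left, pad_w - left)),
                    mode="constant")
    out = np.zeros(image_2d.shape, dtype=image_2d.dtype)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(padded[r:r + kh, c:c + kw] * kernel_2d)
    return out

image_2d = np.arange(1, 10, dtype=np.float32).reshape(3, 3)
kernel_2d = np.ones((2, 2), dtype=np.float32)
same_out = conv2d_same_stride1(image_2d, kernel_2d)
print(same_out)
# [[12. 16.  9.]
#  [24. 28. 15.]
#  [15. 17.  9.]]
```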

2. 3 Filters (2, 2, 1, 3): height, width, channel, number of filters

So far we have used a single convolution filter. Now let's see how to use several.

Let's look at the code.

weight = np.array([[[[1.,10.,-1.]],[[1.,10.,-1.]]],
                   [[[1.,10.,-1.]],[[1.,10.,-1.]]]])
weight_init = tf.constant_initializer(weight)
conv2d = keras.layers.Conv2D(filters=3, kernel_size=2, padding='SAME',
                             kernel_initializer=weight_init)(image)

The weight part is a little confusing, so here it is on its own.

# 3 filters (2, 2, 1, 3)
weight = np.array([[[[1.,10.,-1.]],[[1.,10.,-1.]]],
                   [[[1.,10.,-1.]],[[1.,10.,-1.]]]])

The weight above represents the following filters.

1 1     10 10     -1 -1
1 1     10 10     -1 -1

That makes three filters.
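To see the three filters inside the nested array, you can slice along the last axis, which holds the filter index. This is a small NumPy sketch. The name filters_2d is introduced only for this example.

```python
import numpy as np

weight = np.array([[[[1., 10., -1.]], [[1., 10., -1.]]],
                   [[[1., 10., -1.]], [[1., 10., -1.]]]])
print(weight.shape)  # (2, 2, 1, 3)

# weight[:, :, 0, k] is the 2x2 filter number k for the single input channel.
filters_2d = [weight[:, :, 0, k] for k in range(weight.shape[-1])]
for k, single_filter in enumerate(filters_2d):
    print(k, single_filter.tolist())
# 0 [[1.0, 1.0], [1.0, 1.0]]
# 1 [[10.0, 10.0], [10.0, 10.0]]
# 2 [[-1.0, -1.0], [-1.0, -1.0]]
```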

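# Move the filter axis to the front: (1, 3, 3, 3) -> (3, 3, 3, 1)
feature_maps = np.swapaxes(conv2d.numpy(), 0, 3)
for i, feature_map in enumerate(feature_maps):
    # Each feature map has shape (3, 3, 1); reshape it to 3x3 to print and plot it
    print(feature_map.reshape(3, 3))
    plt.subplot(1, 3, i + 1)
    plt.imshow(feature_map.reshape(3, 3), cmap='gray')
plt.show()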
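The axis swap is easier to follow with the shapes printed. The sketch below uses a NumPy stand-in for the layer output, built by hand under the assumption of a zero bias: the SAME result of the all-ones filter, and that result scaled by 10 and by -1 for the other two filters. The names same_out and conv_out are made up for this example.

```python
import numpy as np

# Stand-in for the layer output: one 3x3 map per filter, stacked on the last axis.
same_out = np.array([[12., 16., 9.],
                     [24., 28., 15.],
                     [15., 17., 9.]])
conv_out = np.stack([same_out, 10. * same_out, -same_out], axis=-1)[np.newaxis]
print(conv_out.shape)      # (1, 3, 3, 3): batch, height, width, filters

# Swapping axes 0 and 3 puts the filter axis first, so a loop yields one map at a time.
feature_maps = np.swapaxes(conv_out, 0, 3)
print(feature_maps.shape)  # (3, 3, 3, 1): filters, height, width, batch

second_map = feature_maps[1].reshape(3, 3)
print(second_map)
```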

As the next step, let's move on to Pooling, another basic CNN operation.

han-py.tistory.com/239
