TensorFlow (4-3) CNN for MNIST

1. Recall the CNN workflow:

In 4-1 we went through the workflow of a CNN. The steps are as follows:


  1. Input an image
  2. Convolve the image and apply ReLU to the responses
  3. Pool (keep the more important information)
  4. Repeat steps 2 and 3
  5. Flatten and connect to fully connected neurons
  6. Output class probabilities with softmax

Today we will follow this workflow and implement a CNN for MNIST.


2. CNN Wrapper functions:

We define a few wrapper functions to make the later computation easier to read; the usage of max_pool and conv2d was explained in the previous post.

#!/usr/bin/python3
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

def weight_variable(shape):
    # Weights drawn from a truncated normal: small, non-zero initial values.
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    # Small positive bias so ReLU units start out active.
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    # 2-D convolution, stride 1, 'SAME' padding (output keeps the input height/width).
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # 2x2 max pooling with stride 2 halves the height and width.
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')

def conv_layer(input, shape):
    # Convolution + bias + ReLU; shape is [H, W, in_channels, out_channels].
    W = weight_variable(shape)
    b = bias_variable([shape[3]])
    return tf.nn.relu(conv2d(input, W) + b)

def full_layer(input, size):
    # Fully connected layer; input must already be flattened to [batch, in_size].
    in_size = int(input.get_shape()[1])
    W = weight_variable([in_size, size])
    b = bias_variable([size])
    return tf.matmul(input, W) + b

The most important ones are the last three functions: we will repeatedly convolve and pool, and finally feed the result into a fully connected (full) layer.
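
To see how the wrappers fit together, here is a small sanity-check sketch of my own (not part of the original program, and assuming the wrapper functions above are already defined) that pushes a dummy 28*28 grayscale batch through two conv/pool stages and a full layer, then prints the static shapes:

# Sanity check (sketch): trace the shapes produced by the wrappers above.
dummy = tf.placeholder(tf.float32, [None, 28, 28, 1])
c1 = conv_layer(dummy, shape=[5, 5, 1, 32])    # (?, 28, 28, 32)
p1 = max_pool_2x2(c1)                          # (?, 14, 14, 32)
c2 = conv_layer(p1, shape=[5, 5, 32, 64])      # (?, 14, 14, 64)
p2 = max_pool_2x2(c2)                          # (?, 7, 7, 64)
flat = tf.reshape(p2, [-1, 7 * 7 * 64])        # (?, 3136)
out = full_layer(flat, 1024)                   # (?, 1024)
print(p1.get_shape(), p2.get_shape(), out.get_shape())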


3. Main process:

We split the whole computation into three parts: computational_process, computational_targets, and train:

1. computational_process: x receives a batch of MNIST examples as flattened 784-dimensional vectors, and x_image reshapes them back into 28*28 single-channel images so that they can be convolved. The filter tensor of the first convolutional layer has shape 5*5*1*32, i.e. 32 filters of size 5*5*1 (H*W*D). Because we use padding='SAME', the output after the first convolution has shape -1*28*28*32, and after max_pool_2x2 it becomes -1*14*14*32. The second convolutional layer's filter tensor is 5*5*32*64, i.e. 64 filters of size 5*5*32 (H*W*D); after conv_layer and max_pool_2x2 the shape is -1*7*7*64. Finally the result is flattened and connected to a fully connected (full) layer of size 1024.

2. computational_targets: this part looks much like an ordinary neural network: we define the accuracy and the cross-entropy loss. The main difference is that we use AdamOptimizer here. A toy example of the accuracy computation follows this list.

3. train: training is also similar to what we did before, except that we add the parameter KEEP_PROB, the dropout keep probability. Dropout is used to avoid overfitting: during training each neuron in the fully connected layer is kept only with probability KEEP_PROB, so the network cannot rely too heavily on any particular neuron; if the result stays about the same when some neurons are dropped, we lose little by dropping them. At evaluation time keep_prob is set back to 1.0. A tiny illustration of tf.nn.dropout follows this list.
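
As a quick illustration of the accuracy computation mentioned in part 2, here is a toy example of my own (not from the original program): tf.argmax picks the predicted class, tf.equal compares it with the label, and the mean of the cast booleans is the accuracy.

import tensorflow as tf
# Toy accuracy check (sketch): 3 examples, 2 classes; the first two
# predictions match the one-hot labels, the third does not.
logits = tf.constant([[2.0, 0.1], [0.3, 1.5], [0.9, 0.2]])
labels = tf.constant([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
correct = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
acc = tf.reduce_mean(tf.cast(correct, tf.float32))
with tf.Session() as sess:
    print(sess.run(acc))   # 2 of 3 correct -> about 0.667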
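
And for the dropout mentioned in part 3, another small sketch of my own: tf.nn.dropout keeps each element with probability keep_prob and scales the survivors by 1/keep_prob, so the expected activation is unchanged; setting keep_prob to 1.0 turns dropout off.

import tensorflow as tf
# Dropout illustration (sketch): with keep_prob=0.5 roughly half of the
# entries become 0 and the surviving ones are scaled to 2.0.
v = tf.ones([1, 10])
dropped = tf.nn.dropout(v, keep_prob=0.5)
with tf.Session() as sess:
    print(sess.run(dropped))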

if __name__ == "__main__":
    with tf.name_scope("computational_process"):
        x = tf.placeholder(tf.float32, shape=[None, 784])
        y_true = tf.placeholder(tf.float32, shape=[None, 10])
        x_image = tf.reshape(x, [-1, 28, 28, 1])

        conv1 = conv_layer(x_image, shape=[5, 5, 1, 32])        # -> (?, 28, 28, 32)
        conv1_pool = max_pool_2x2(conv1)                         # -> (?, 14, 14, 32)

        conv2 = conv_layer(conv1_pool, shape=[5, 5, 32, 64])     # -> (?, 14, 14, 64)
        conv2_pool = max_pool_2x2(conv2)                         # -> (?, 7, 7, 64)

        conv2_flat = tf.reshape(conv2_pool, [-1, 7*7*64])        # flatten to (?, 3136)

        full_1 = tf.nn.relu(full_layer(conv2_flat, 1024))

        keep_prob = tf.placeholder(tf.float32)                   # dropout keep probability
        full1_drop = tf.nn.dropout(full_1, keep_prob=keep_prob)
        y_conv = full_layer(full1_drop, 10)                      # logits for the 10 digit classes

    with tf.name_scope("computational_targets"):
        DATA_DIR = "data"
        mnist = input_data.read_data_sets(DATA_DIR, one_hot=True)
        # Loss: mean softmax cross entropy between the logits and the one-hot labels.
        cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_true, logits=y_conv))
        train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)   # Adam, learning rate 1e-4
        correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_true, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    with tf.name_scope("train"):
        STEPS = 300
        BATCH_SIZE = 50
        KEEP_PROB = 0.8   # keep probability used during training; 1.0 at evaluation
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            for i in range(STEPS):
                batch = mnist.train.next_batch(BATCH_SIZE)
                sess.run(train_step, feed_dict={x: batch[0], y_true: batch[1], keep_prob: KEEP_PROB})
            
                if i % 30 == 0:
                    # Report accuracy on the current batch with dropout turned off.
                    train_accuracy = sess.run(accuracy,
                                              feed_dict={x: batch[0],
                                                         y_true: batch[1],
                                                         keep_prob: 1.0})
                    print("step {}, training accuracy {}".format(i, train_accuracy))
                
                
            # Evaluate on the full test set in 10 chunks of 1000 images to limit memory use.
            X = mnist.test.images.reshape(10, 1000, 784)
            Y = mnist.test.labels.reshape(10, 1000, 10)
            test_accuracy = np.mean([sess.run(accuracy,
                                     feed_dict={x: X[i], y_true: Y[i], keep_prob: 1.0})
                                     for i in range(10)])
            print("test accuracy: {}".format(test_accuracy))


