[ML] Neural Network Image Recognition – Loading Images into NumPy

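This is my first attempt at using Python and TensorFlow to analyze the etching quality of PCB circuit boards. The images being analyzed look like the sample image on the home page: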

1. Displaying the files in the browser with Jupyter Notebook

The following code first determines the working directory and lists its contents:

import os
from subprocess import check_output
home_path = os.getcwd()
print(home_path)
print(check_output(["ls", home_path]).decode("utf8"))

Next, collect the file paths into onlyfiles and use _Imgdis to display one of the images:

import os.path, sys
from IPython.display import display
from IPython.display import Image as _Imgdis

folder = os.path.join(home_path, "png")
onlyfiles = []
for f in os.listdir(folder):
    onlyfiles.append(os.path.join(folder,f))
    
display(_Imgdis(filename = onlyfiles[40]))
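
One thing to watch in the listing step: os.listdir returns entries in arbitrary order and includes every entry in the folder, so onlyfiles[40] may point at a different image on another machine, or at a non-image file. A minimal sketch of a stricter listing follows; list_png_files is a hypothetical helper name, not part of the original notebook.

```python
import os

def list_png_files(folder):
    """Return the full paths of the .png files in folder, sorted by name."""
    paths = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        # Keep regular files with a .png extension; skip subfolders and other files
        if name.lower().endswith(".png") and os.path.isfile(full_path):
            paths.append(full_path)
    return paths
```

With this helper, onlyfiles = list_png_files(folder) could stand in for the loop above, and the index of each image would then follow the file names.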

2. Loading the image files into a multidimensional NumPy array with Jupyter Notebook

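The next step is to load all 978 images into a NumPy array. Before loading them, we need to know the image dimensions so that we can allocate an array of the right shape to hold the pixel data. On Linux, the file command reports this: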

file 00978.png
00978.png: PNG image data, 1600 x 1200, 8-bit/color RGB, non-interlaced

This tells us that 00978.png is 1600 x 1200 pixels, so we need to prepare a dataset array of matching size to store the pixel data.
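
The same information can also be read from inside the notebook. The sketch below parses the fixed-layout IHDR chunk at the start of a PNG file using only the standard library; read_png_header is a hypothetical helper name, and the function only covers the header fields shown here.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_png_header(path):
    """Return (width, height, bit_depth, color_type) from a PNG file's IHDR chunk."""
    with open(path, "rb") as fh:
        header = fh.read(26)
    # Bytes 0-7: signature, 8-11: chunk length, 12-15: chunk type "IHDR"
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} does not look like a PNG file")
    # Bytes 16-25: width and height (big-endian uint32), bit depth, color type
    width, height, bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return width, height, bit_depth, color_type
```

For a file like the one reported above, the expected result would be a width of 1600, a height of 1200, a bit depth of 8, and color type 2, which is the PNG code for RGB.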

import numpy as np

# Copy the list of image paths collected earlier
train_files = list(onlyfiles)

image_width = 1600
image_height = 1200
channels = 3
nb_classes = 1
# Allocate an uninitialized array laid out as (images, channels, height, width)
dataset = np.empty((len(train_files), channels, image_height, image_width), dtype=np.float32)
dataset.shape
(978, 3, 1200, 1600)
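
An array of this shape is large, so it is worth estimating its memory footprint before filling it. The arithmetic below is a rough sketch that counts only the raw element storage; the uint8 comparison is my own suggestion, not something the original notebook does.

```python
import math

shape = (978, 3, 1200, 1600)   # images, channels, height, width
n_values = math.prod(shape)

float32_bytes = n_values * 4   # np.float32 stores 4 bytes per value
uint8_bytes = n_values * 1     # np.uint8 stores 1 byte per value

print(f"float32: {float32_bytes:,} bytes = {float32_bytes / 1e9:.1f} GB")
print(f"uint8:   {uint8_bytes:,} bytes = {uint8_bytes / 1e9:.1f} GB")
# prints:
# float32: 22,533,120,000 bytes = 22.5 GB
# uint8:   5,633,280,000 bytes = 5.6 GB
```

Since the source images are 8-bit per channel, a uint8 array would hold the raw pixel values without loss at a quarter of the size. Downscaling the images is another option if the full-resolution array does not fit in memory.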

Next, load the image contents one by one, then print the pixel values of the first image.

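# Assumed import: the original omits where load_img and img_to_array come from
from tensorflow.keras.preprocessing.image import load_img, img_to_array

for i, _file in enumerate(train_files):
    img = load_img(_file)
    # Shrink in place if the image is larger than the target size
    img.thumbnail((image_width, image_height))
    x = img_to_array(img)       # shape (height, width, channels)
    # Move the channel axis to the front; reshape would scramble the pixel order
    x = x.transpose(2, 0, 1)    # shape (channels, height, width)
    dataset[i] = x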
    
dataset[0,:,:,:]
array([[[178., 177., 174., ..., 192., 192., 188.],
        [191., 191., 185., ..., 191., 186., 192.],
        [191., 185., 191., ..., 155., 129., 159.],
        ...,
        [191., 189., 191., ...,  87.,  85., 103.],
        [ 98.,  96., 104., ...,  84.,  78.,  76.],
        [ 80.,  74.,  72., ..., 144., 127., 150.]],

       [[191., 188., 190., ...,  97.,  98., 106.],
        [ 99., 100., 106., ...,  85.,  75.,  74.],
        [ 73.,  71.,  70., ..., 144., 129., 143.],
        ...,
        [194., 185., 187., ...,  36.,  32.,  40.],
        [ 38.,  34.,  43., ..., 188., 185., 193.],
        [189., 185., 193., ..., 152., 130., 151.]],

       [[189., 185., 190., ...,  32.,  32.,  41.],
        [ 38.,  38.,  46., ..., 190., 188., 189.],
        [190., 189., 190., ..., 151., 129., 148.],
        ...,
        [  0.,   0.,   0., ...,   0.,   0.,   0.],
        [  0.,   0.,   0., ...,   0.,   0.,   0.],
        [  0.,   0.,   0., ...,   0.,   0.,   0.]]], dtype=float32)
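
A note on the axis order: the dataset array is laid out as (channels, height, width), while img_to_array returns (height, width, channels) unless the Keras image data format has been changed. Converting between the two is an axis permutation, which reshape does not perform. The toy example below is a sketch on a made-up 2 x 2 image and shows the difference.

```python
import numpy as np

# Toy 2 x 2 RGB image in (height, width, channels) order.
# Pixel (r, c) holds the values [6r + 3c, 6r + 3c + 1, 6r + 3c + 2].
hwc = np.arange(2 * 2 * 3, dtype=np.float32).reshape(2, 2, 3)

chw_transposed = hwc.transpose(2, 0, 1)  # moves the channel axis to the front
chw_reshaped = hwc.reshape(3, 2, 2)      # only reinterprets the flat buffer

print(chw_transposed[0])  # the red plane: [[0. 3.] [6. 9.]]
print(chw_reshaped[0])    # mixed channels: [[0. 1.] [2. 3.]]
```

After the transpose, index 0 along the first axis contains only first-channel values. After the reshape, the same slice interleaves values from all three channels, so the result no longer represents per-channel image planes.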

Reference: https://www.kaggle.com/lgmoneda/from-image-files-to-numpy-arrays