I recently trained a binary image classifier and ended up with a model with roughly 97.8% accuracy. I built the classifier by following a couple of the official TensorFlow guides, namely:
- https://www.tensorflow.org/tutorials/images/classification
- https://www.tensorflow.org/tutorials/load_data/images
While training (on a GTX 1080) I noticed that each epoch took around 30 seconds to run. Further reading suggested that a better way to feed data into a TensorFlow training run is to use datasets, so I updated my code to load the images into a dataset and then read them via the model.fit_generator method.
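For what it's worth, my understanding is that in TF 2.x a tf.data.Dataset can also be passed straight to model.fit (fit_generator is deprecated). A minimal self-contained sketch, with synthetic data standing in for my real images:

```python
import tensorflow as tf

# Synthetic stand-ins for my real images/labels (shapes are illustrative).
images = tf.random.uniform((8, 32, 32, 3))
labels = tf.constant([0, 1, 0, 1, 0, 1, 0, 1])
ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)

# A tiny model, just to show the dataset being passed straight to fit().
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(32, 32, 3)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])
history = model.fit(ds, epochs=1, verbose=0)
```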
Now when I train, I find that my accuracy and loss metrics are static, even though the learning rate changes automatically over time. The output looks like this:
loss: 7.7125 - acc: 0.5000 - val_loss: 7.7125 - val_acc: 0.5000
Given that I'm training a binary classifier, 50% accuracy is the same as guessing, so I'm wondering whether there is a problem with how I'm supplying the images, or with the size of my dataset.
My image data is split like this:
training/
true/ (366 images)
false/ (354 images)
validation/
true/ (175 images)
false/ (885 images)
I previously used ImageDataGenerator to apply various mutations, thereby augmenting the overall dataset. Is there a problem with my dataset size?
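To illustrate the kind of augmentation I mean, here is a sketch of an ImageDataGenerator setup (the specific parameters here are illustrative, not my exact configuration):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation settings -- not my exact configuration.
datagen = ImageDataGenerator(
    rescale=1.0 / 255,       # scale pixel values to [0, 1]
    rotation_range=20,       # random rotations up to 20 degrees
    width_shift_range=0.1,   # random horizontal shifts
    height_shift_range=0.1,  # random vertical shifts
    horizontal_flip=True,    # random left-right flips
)

# Two random "images" stand in for real data.
batch = np.random.randint(0, 256, size=(2, 64, 64, 3)).astype('float32')
labels = np.array([0, 1])
aug_images, aug_labels = next(datagen.flow(batch, labels, batch_size=2))
```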
The application code I'm using is below:
import math
import tensorflow as tf
import os
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.callbacks import EarlyStopping
import helpers
import settings
AUTOTUNE = tf.data.experimental.AUTOTUNE
assert tf.test.is_built_with_cuda()
assert tf.test.is_gpu_available()
# Collect the list of training files and process their paths.
training_dataset_files = tf.data.Dataset.list_files(os.path.join(settings.TRAINING_DIRECTORY, '*', '*.png'))
training_dataset_labelled = training_dataset_files.map(helpers.process_path, num_parallel_calls=AUTOTUNE)
training_dataset = helpers.prepare_for_training(training_dataset_labelled)

# Collect the validation files.
validation_dataset_files = tf.data.Dataset.list_files(os.path.join(settings.VALIDATION_DIRECTORY, '*.png'))
validation_dataset_labelled = validation_dataset_files.map(helpers.process_path, num_parallel_calls=AUTOTUNE)
validation_dataset = helpers.prepare_for_training(validation_dataset_labelled)
model = tf.keras.models.Sequential([
    # This is the first convolution
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu',
                           input_shape=(settings.TARGET_IMAGE_HEIGHT, settings.TARGET_IMAGE_WIDTH, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The second convolution
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The third convolution
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The fourth convolution
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The fifth convolution
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    # 512 neuron hidden layer
    tf.keras.layers.Dense(512, activation='relu'),
    # Only 1 output neuron. It will contain a value from 0-1,
    # where 0 is one class ('false') and 1 is the other ('true')
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.summary()
model.compile(
    loss='binary_crossentropy',
    optimizer=RMSprop(lr=0.1),
    metrics=['acc']
)
callbacks = [
    # EarlyStopping(patience=4),
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor='val_acc', patience=2, verbose=1, factor=0.5, min_lr=0.00001
    ),
    tf.keras.callbacks.ModelCheckpoint(
        # Path where to save the model
        filepath=settings.CHECKPOINT_FILE,
        # The two parameters below mean that we will overwrite
        # the current checkpoint if and only if
        # the `val_loss` score has improved.
        save_best_only=True,
        monitor='val_loss',
        verbose=1
    ),
    tf.keras.callbacks.TensorBoard(
        log_dir=settings.LOG_DIRECTORY, histogram_freq=1
    )
]
training_dataset_length = tf.data.experimental.cardinality(training_dataset_files).numpy()
steps_per_epoch = math.ceil(training_dataset_length / settings.TRAINING_BATCH_SIZE)

validation_dataset_length = tf.data.experimental.cardinality(validation_dataset_files).numpy()
validation_steps = math.ceil(validation_dataset_length / settings.VALIDATION_BATCH_SIZE)
history = model.fit_generator(
    training_dataset,
    steps_per_epoch=steps_per_epoch,
    epochs=20000,
    validation_data=validation_dataset,
    validation_steps=validation_steps,
    callbacks=callbacks,
)
model.save(settings.FULL_MODEL_FILE)
helpers.py is as follows:
import tensorflow as tf
import settings

AUTOTUNE = tf.data.experimental.AUTOTUNE

def process_path(file_path):
    parts = tf.strings.split(file_path, '\\')
    label = parts[-2] == settings.CLASS_NAMES
    # Read the file and decode the image
    img = tf.io.read_file(file_path)
    img = tf.image.decode_png(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    img = tf.image.resize(img, [settings.TARGET_IMAGE_HEIGHT, settings.TARGET_IMAGE_WIDTH])
    return img, label

def prepare_for_training(ds, cache=True, shuffle_buffer_size=10000):
    if cache:
        if isinstance(cache, str):
            ds = ds.cache(cache)
        else:
            ds = ds.cache()
    ds = ds.shuffle(buffer_size=shuffle_buffer_size)
    ds = ds.repeat()
    ds = ds.batch(settings.TRAINING_BATCH_SIZE)
    ds = ds.prefetch(buffer_size=AUTOTUNE)
    return ds
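As a sanity check on the path-to-label mapping in process_path, here is a standalone sketch of the same idea (CLASS_NAMES here is a hypothetical stand-in for my settings module, and I split on '/' rather than the '\\' my Windows paths use):

```python
import tensorflow as tf

# Hypothetical stand-ins for settings.CLASS_NAMES and my directory layout;
# I split on '/' here, while the real code splits on '\\' for Windows paths.
CLASS_NAMES = ['false', 'true']
paths = ['training/true/a.png', 'training/false/b.png']

def path_to_label(file_path):
    # Same idea as process_path: the parent directory name is the label.
    parts = tf.strings.split(file_path, '/')
    return parts[-2] == tf.constant(CLASS_NAMES)

labels = [path_to_label(p).numpy() for p in paths]
# labels[0] -> [False, True] ('true' class); labels[1] -> [True, False]
```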
A larger snippet of the application output is below:
21/22 [===========================>..] - eta: 0s - loss: 7.7125 - acc: 0.5000
Epoch 00207: val_loss did not improve from 7.71247
22/22 [==============================] - 5s 247ms/step - loss: 7.7125 - acc: 0.5000 - val_loss: 7.7125 - val_acc: 0.5000
Epoch 208/20000
21/22 [===========================>..] - eta: 0s - loss: 7.7125 - acc: 0.5000
Epoch 00208: val_loss did not improve from 7.71247
22/22 [==============================] - 5s 248ms/step - loss: 7.7125 - acc: 0.5000 - val_loss: 7.7125 - val_acc: 0.5000
Epoch 209/20000
21/22 [===========================>..] - eta: 0s - loss: 7.7125 - acc: 0.5000
Epoch 00209: val_loss did not improve from 7.71247
22/22 [==============================] - 6s 251ms/step - loss: 7.7125 - acc: 0.5000 - val_loss: 7.7125 - val_acc: 0.5000
... (the same loss/accuracy values repeat, unchanged, for epochs 210-218) ...
Epoch 219/20000
19/22 [========================>.....] - eta: 0s - loss: 7.7125 - acc: 0.5000