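Why does my convolutional model for detecting image rotation predict the same class for every picture?

I want my model to detect the rotation angle (360 classes) of text images that I generate myself. To get more data to train on, the images are rotated again by a new random angle for every training epoch. However, the model does not seem to learn, because it predicts the same class for every image. I have tried changing the batch size, the optimizer, and the learning rate, and I have tried a more complex model, but nothing has helped.

In this example I am using 500 training samples, 50 validation samples, and 10 test samples. I have tried up to 2000 training samples, but the same problem occurs.

Here is my output: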

Using TensorFlow backend.
WARNING:tensorflow:From /home/lisa/.local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:4070: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 222, 222, 32)      896       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 111, 111, 32)      0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 109, 109, 64)      18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 54, 54, 64)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 52, 52, 128)       73856     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 26, 26, 128)       0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 24, 24, 128)       147584    
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 12, 12, 128)       0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 18432)             0         
_________________________________________________________________
dense_1 (Dense)              (None, 512)               9437696   
_________________________________________________________________
dense_2 (Dense)              (None, 360)               184680    
=================================================================
Total params: 9,863,208
Trainable params: 9,863,208
Non-trainable params: 0
_________________________________________________________________
2019-11-06 11:08:47.885295: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 fma
2019-11-06 11:08:47.901431: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3408000000 Hz
2019-11-06 11:08:47.902091: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55f4487aac50 executing computations on platform Host. Devices:
2019-11-06 11:08:47.902139: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>,<undefined>
2019-11-06 11:08:47.903354: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-11-06 11:08:47.921001: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1),but there must be at least one NUMA node,so returning NUMA node zero
2019-11-06 11:08:47.921953: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce GTX 970 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:01:00.0
2019-11-06 11:08:47.922112: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-11-06 11:08:47.922988: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-11-06 11:08:47.923739: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-11-06 11:08:47.923921: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-11-06 11:08:47.924921: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-11-06 11:08:47.925684: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-11-06 11:08:47.928111: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-11-06 11:08:47.928199: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1),so returning NUMA node zero
2019-11-06 11:08:47.929103: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1),so returning NUMA node zero
2019-11-06 11:08:47.929818: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-11-06 11:08:47.929844: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-11-06 11:08:47.976192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-06 11:08:47.976213: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2019-11-06 11:08:47.976219: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2019-11-06 11:08:47.976372: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1),so returning NUMA node zero
2019-11-06 11:08:47.977217: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1),so returning NUMA node zero
2019-11-06 11:08:47.978039: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1),so returning NUMA node zero
2019-11-06 11:08:47.978851: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3466 MB memory) -> physical GPU (device: 0,name: GeForce GTX 970,pci bus id: 0000:01:00.0,compute capability: 5.2)
2019-11-06 11:08:47.980313: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55f449158000 executing computations on platform CUDA. Devices:
2019-11-06 11:08:47.980326: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): GeForce GTX 970,Compute Capability 5.2
WARNING:tensorflow:From /home/lisa/.local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:422: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

Epoch 1/50
2019-11-06 11:08:48.922378: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-11-06 11:08:49.080712: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
16/16 [==============================] - 3s 199ms/step - loss: 10271548.3852 - mse_angle: 88.4758 - val_loss: 6.0310 - val_mse_angle: 83.5972
Epoch 2/50
16/16 [==============================] - 1s 84ms/step - loss: 6.0294 - mse_angle: 87.3988 - val_loss: 6.2498 - val_mse_angle: 90.8889
Epoch 3/50
16/16 [==============================] - 1s 82ms/step - loss: 6.9000 - mse_angle: 90.9215 - val_loss: 6.2606 - val_mse_angle: 96.1042
Epoch 4/50
16/16 [==============================] - 1s 82ms/step - loss: 6.0261 - mse_angle: 90.2238 - val_loss: 6.1281 - val_mse_angle: 89.1111
Epoch 5/50
16/16 [==============================] - 1s 82ms/step - loss: 6.0339 - mse_angle: 90.6246 - val_loss: 6.1609 - val_mse_angle: 84.5764
Epoch 6/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9953 - mse_angle: 90.6105 - val_loss: 6.0373 - val_mse_angle: 97.3819
Epoch 7/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9419 - mse_angle: 90.0617 - val_loss: 6.0082 - val_mse_angle: 99.2257
Epoch 8/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9563 - mse_angle: 89.2258 - val_loss: 6.0243 - val_mse_angle: 99.2257
Epoch 9/50
16/16 [==============================] - 1s 83ms/step - loss: 5.9515 - mse_angle: 92.9902 - val_loss: 6.0726 - val_mse_angle: 87.7812
Epoch 10/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9554 - mse_angle: 89.0434 - val_loss: 6.0980 - val_mse_angle: 81.9757
Epoch 11/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9761 - mse_angle: 90.9699 - val_loss: 6.1573 - val_mse_angle: 99.1910
Epoch 12/50
16/16 [==============================] - 1s 83ms/step - loss: 5.9674 - mse_angle: 87.5254 - val_loss: 6.1502 - val_mse_angle: 91.5312
Epoch 13/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9457 - mse_angle: 90.9098 - val_loss: 6.1447 - val_mse_angle: 89.7708
Epoch 14/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9803 - mse_angle: 92.3281 - val_loss: 6.1520 - val_mse_angle: 97.5417
Epoch 15/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9663 - mse_angle: 91.3766 - val_loss: 6.1332 - val_mse_angle: 81.1562
Epoch 16/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9707 - mse_angle: 89.2891 - val_loss: 6.0442 - val_mse_angle: 88.7361
Epoch 17/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9691 - mse_angle: 87.9980 - val_loss: 5.8971 - val_mse_angle: 81.1562
Epoch 18/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9675 - mse_angle: 87.8605 - val_loss: 5.9070 - val_mse_angle: 81.1562
Epoch 19/50
16/16 [==============================] - 1s 81ms/step - loss: 5.9816 - mse_angle: 88.3820 - val_loss: 6.0384 - val_mse_angle: 90.0694
Epoch 20/50
16/16 [==============================] - 1s 82ms/step - loss: 6.0144 - mse_angle: 91.3855 - val_loss: 6.1066 - val_mse_angle: 90.0694
Epoch 21/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9556 - mse_angle: 92.5727 - val_loss: 6.2307 - val_mse_angle: 86.2465
Epoch 22/50
16/16 [==============================] - 1s 83ms/step - loss: 5.9522 - mse_angle: 90.1418 - val_loss: 6.1750 - val_mse_angle: 81.9062
Epoch 23/50
16/16 [==============================] - 1s 81ms/step - loss: 5.9603 - mse_angle: 88.3703 - val_loss: 6.0286 - val_mse_angle: 81.9062
Epoch 24/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9608 - mse_angle: 90.1531 - val_loss: 5.9816 - val_mse_angle: 97.9549
Epoch 25/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9764 - mse_angle: 88.8660 - val_loss: 6.0606 - val_mse_angle: 89.0174
Epoch 26/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9771 - mse_angle: 90.2336 - val_loss: 6.0759 - val_mse_angle: 83.8507
Epoch 27/50
16/16 [==============================] - 1s 82ms/step - loss: 6.0073 - mse_angle: 90.3863 - val_loss: 6.0298 - val_mse_angle: 83.8507
Epoch 28/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9664 - mse_angle: 89.0832 - val_loss: 5.9718 - val_mse_angle: 83.5972
Epoch 29/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9445 - mse_angle: 88.3340 - val_loss: 5.9844 - val_mse_angle: 82.4306
Epoch 30/50
16/16 [==============================] - 1s 81ms/step - loss: 5.9596 - mse_angle: 90.2934 - val_loss: 5.8805 - val_mse_angle: 83.0521
Epoch 31/50
16/16 [==============================] - 1s 83ms/step - loss: 5.9729 - mse_angle: 91.9238 - val_loss: 5.9500 - val_mse_angle: 84.4444
Epoch 32/50
16/16 [==============================] - 1s 83ms/step - loss: 5.9743 - mse_angle: 90.0250 - val_loss: 6.0221 - val_mse_angle: 97.5556
Epoch 33/50
16/16 [==============================] - 1s 81ms/step - loss: 5.9469 - mse_angle: 86.5922 - val_loss: 6.0201 - val_mse_angle: 87.6076
Epoch 34/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9822 - mse_angle: 93.8836 - val_loss: 5.9119 - val_mse_angle: 81.3472
Epoch 35/50
16/16 [==============================] - 1s 81ms/step - loss: 5.9751 - mse_angle: 88.9707 - val_loss: 5.9052 - val_mse_angle: 99.3993
Epoch 36/50
16/16 [==============================] - 1s 83ms/step - loss: 5.9564 - mse_angle: 89.6219 - val_loss: 5.9162 - val_mse_angle: 92.5278
Epoch 37/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9864 - mse_angle: 94.1816 - val_loss: 5.9559 - val_mse_angle: 90.5278
Epoch 38/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9566 - mse_angle: 88.3102 - val_loss: 6.0087 - val_mse_angle: 99.3993
Epoch 39/50
16/16 [==============================] - 1s 83ms/step - loss: 5.9639 - mse_angle: 91.0492 - val_loss: 5.9907 - val_mse_angle: 94.2361
Epoch 40/50
16/16 [==============================] - 1s 83ms/step - loss: 5.9792 - mse_angle: 88.0059 - val_loss: 5.8827 - val_mse_angle: 94.3056
Epoch 41/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9297 - mse_angle: 92.0566 - val_loss: 5.8013 - val_mse_angle: 94.6319
Epoch 42/50
16/16 [==============================] - 1s 84ms/step - loss: 5.9666 - mse_angle: 88.4168 - val_loss: 5.8768 - val_mse_angle: 99.4826
Epoch 43/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9887 - mse_angle: 90.3191 - val_loss: 5.9197 - val_mse_angle: 96.8611
Epoch 44/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9889 - mse_angle: 87.8867 - val_loss: 5.8738 - val_mse_angle: 96.6875
Epoch 45/50
16/16 [==============================] - 1s 83ms/step - loss: 5.9694 - mse_angle: 92.4437 - val_loss: 5.8639 - val_mse_angle: 98.7222
Epoch 46/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9560 - mse_angle: 89.9125 - val_loss: 5.8387 - val_mse_angle: 82.4965
Epoch 47/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9468 - mse_angle: 89.7066 - val_loss: 5.9525 - val_mse_angle: 87.1632
Epoch 48/50
16/16 [==============================] - 1s 83ms/step - loss: 6.0111 - mse_angle: 89.5977 - val_loss: 5.9091 - val_mse_angle: 96.6875
Epoch 49/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9648 - mse_angle: 89.0430 - val_loss: 5.9656 - val_mse_angle: 92.8368
Epoch 50/50
16/16 [==============================] - 1s 82ms/step - loss: 5.9234 - mse_angle: 91.1891 - val_loss: 5.9717 - val_mse_angle: 99.2257
for image 0 angle: 312,pred: 46
for image 1 angle: 202,pred: 46
for image 2 angle: 235,pred: 46
for image 3 angle: 286,pred: 46
for image 4 angle: 226,pred: 46
for image 5 angle: 76,pred: 46
for image 6 angle: 91,pred: 46
for image 7 angle: 91,pred: 46
for image 8 angle: 97,pred: 46
for image 9 angle: 263,pred: 46

Here is my model.py:

import numpy as np
from keras import backend as K
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.layers import Input, Dense, Flatten
from keras.models import Model
from keras.optimizers import Adam
from keras.preprocessing import image as keras_image
from keras.utils import Sequence
from keras.utils.np_utils import to_categorical
from PIL import Image
import math
from random import randint
import os
from numpy import argmax
from create_text_images import create_data

def get_dataset(directory, name):
    """
    Resize the pictures in the directory and return as a numpy array.
    """
    X_train = []
    for i, img_name in enumerate(os.listdir(directory)):
        img_path = os.path.join(directory, img_name)
        with Image.open(img_path) as img:
            img = img.resize((262, 262))
            x = keras_image.img_to_array(img)
        X_train.append(x)
    X_train = np.array(X_train)
    return X_train

def rotate_pictures(X_images):
    """
    Randomly rotate the picture, then crop it to size 224x224.
    Return the image as x normalized /255
    and the rotation (converted to 360 categories) as y.
    """
    X_train, y_train = [], []
    for i, img in enumerate(X_images):
        img = keras_image.array_to_img(img)
        rotation = randint(0, 359)
        img = img.rotate(rotation, resample=Image.BICUBIC)
        w, h = img.size
        # Central 224x224 crop of the rotated 262x262 image.
        img = img.crop(((w//2 - 112), (h//2 - 112), (w//2 + 112), (h//2 + 112)))
        x = keras_image.img_to_array(img) / 255.0
        X_train.append(x)
        y_train.append(rotation)
    y_train = to_categorical(y_train, num_classes=360)
    X_train = np.array(X_train)
    y_train = np.array(y_train)
    return X_train, y_train

class data_generator(Sequence):
    """
    On initiation, create x and y data with the rotated pictures and their rotation.
    If the dataset is 'train', then rotate original pictures again after every epoch.
    """
    def __init__(self, images, name, batch_size):
        self.images = images
        self.name = name
        self.x, self.y = rotate_pictures(self.images)
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])
        self.on_epoch_end()
    def __len__(self):
        return math.ceil(self.x.shape[0] / self.batch_size)
    def __getitem__(self, idx):
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_x = self.x[inds]
        batch_y = self.y[inds]
        return batch_x, batch_y
    def on_epoch_end(self):
        if self.name == "train":
            self.x, self.y = rotate_pictures(self.images)

def mse_angle(y_true, y_pred):
    """
    Calculate the mean difference between the true angles
    and the predicted angles. Each angle is represented
    as a binary vector.
    """
    a = K.argmax(y_true)
    b = K.argmax(y_pred)
    # Wrap-around difference: 350 vs. 10 degrees counts as 20, not 340.
    diff = 180 - K.abs(K.abs(a - b) - 180)
    return K.mean(K.cast(K.abs(diff), K.floatx()))

train_dir = "train/"
val_dir = "val/"
test_dir = "test/"
number_of_epochs = 50
number_of_classes = 360
input_shape = (224, 224, 3)
activation_fn = 'softmax'
batch_size = 32

create_data(train_dir, 500)
X_train = get_dataset(train_dir, "train")
train_generator = data_generator(X_train, "train", batch_size)
create_data(val_dir, 50)
X_val = get_dataset(val_dir, "val")
val_generator = data_generator(X_val, "val", batch_size)
create_data(test_dir, 10)
X_test = get_dataset(test_dir, "test")
X_test, y_test = rotate_pictures(X_test)

# Four conv/pool stages, matching the model summary in the output above.
input_tensor = Input(shape=input_shape)
x = Conv2D(32, (3, 3), activation='relu')(input_tensor)
x = MaxPooling2D((2, 2), strides=(2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(128, (3, 3), activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(128, (3, 3), activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
x = Flatten()(x)
x = Dense(512, activation='relu')(x)
output_tensor = Dense(number_of_classes, activation=activation_fn)(x)
model = Model(input_tensor, output_tensor)
model.summary()

model.compile(
        loss='categorical_crossentropy', optimizer=Adam(lr=0.1), metrics=[mse_angle]
        )

history = model.fit(
        train_generator, epochs=number_of_epochs, validation_data=val_generator
        )
model.save_weights('model_weights.h5')

predictions = model.predict(X_test)

for i, prediction in enumerate(predictions):
    angle = argmax(y_test[i])
    pred = argmax(prediction)
    print("for image {0} angle: {1},pred: {2}".format(i, angle, pred))
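
The wrap-around arithmetic used in mse_angle can be checked by hand. The following is a minimal pure-Python sketch, not part of the original script; the helper name angle_error is hypothetical, and the angles are the ten test angles and the constant prediction 46 printed in the output above.

```python
def angle_error(true_deg, pred_deg):
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    return 180 - abs(abs(true_deg - pred_deg) - 180)

# True angles and the constant prediction, copied from the printed test output.
true_angles = [312, 202, 235, 286, 226, 76, 91, 91, 97, 263]
predicted = 46

errors = [angle_error(t, predicted) for t in true_angles]
mean_error = sum(errors) / len(errors)

print(angle_error(350, 10))  # 20: going through 0 is shorter than 340
print(errors)
print(mean_error)            # 103.5
```

A mean error of this size is about what a fixed guess would be expected to give, which matches the impression that the network ignores its input.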

To run the code, place it in a directory that contains three empty folders (val, train, test) and create_text_images.py:

import random
import string
from PIL import Image,ImageDraw,ImageFont

def get_random_string(stringLength):
    characters = 10*string.ascii_letters + 100*' ' + string.punctuation*2 + string.digits
    return ''.join(random.choice(characters) for i in range(stringLength))

def get_random_text(lines_min,lines_max,char_min,char_max,newline_min,newline_max):
    lines = ''
    for line in range(random.randint(lines_min,lines_max+1)):
        lines += get_random_string(random.randint( char_min,char_max+1))
        lines += '\n' * random.randint(newline_min,newline_max+1)
    return lines

def create_random_image(directory, file_name, paragraphs_min, paragraphs_max, fontsize_min, fontsize_max,
                        lines_min, lines_max, char_min, char_max, newline_min, newline_max):
    img = Image.new('RGB', (876, 876), color='white')
    img.alpha_channel = False
    d = ImageDraw.Draw(img)
    for i in range(random.randint(paragraphs_min, paragraphs_max+1)):
        fnt = ImageFont.truetype('Roboto-Black.ttf', random.randint(fontsize_min, fontsize_max+1))
        text = get_random_text(lines_min, lines_max, char_min, char_max, newline_min, newline_max)
        d.text((50, 100+random.uniform(300, 500)*i), text, fill='black', font=fnt)
    img.save('{0}/{1}.png'.format(directory, file_name))

def create_data(directory,count):
    for i in range(0,count):
        create_random_image(directory,i,3,6,30,70,1,10,100,3)

Thank you very much for any hints!

Edit: removed two unused lines of code.

amico1969 answered: Why does my convolutional model for detecting image rotation predict the same class for every picture?

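As mentioned in my comment, I was able to reproduce your problem with the code you provided, and I reformulated it from a classification problem with 360 classes into a regression problem.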
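
A rough sanity check on the training log in the question, offered as a sketch that is not part of the original code: with 360 classes, a model that spreads its probability evenly has a categorical cross-entropy of ln(360), and a single fixed guess has a mean wrap-around error of 90 degrees when all angles are equally likely. The variable names below are illustrative only.

```python
import math

NUM_CLASSES = 360

# Cross-entropy of a uniform guess over 360 classes: -ln(1/360) = ln(360).
uniform_cross_entropy = math.log(NUM_CLASSES)

# Mean wrap-around error of one fixed guess when every angle is equally likely.
distances = [min(d, NUM_CLASSES - d) for d in range(NUM_CLASSES)]
chance_angle_error = sum(distances) / NUM_CLASSES

print(round(uniform_cross_entropy, 3))  # 5.886
print(chance_angle_error)               # 90.0
```

The logged loss of roughly 5.9 to 6.0 and the mse_angle values around 90 both sit close to these chance levels, which is consistent with a network that has not learned anything from the images.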

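So I changed the output layer to a single neuron with a sigmoid activation, changed the labels to continuous numbers and normalized them (dividing by 360) so that they lie between 0 and 1, changed the loss function to MSE, and used the default values for the optimizer.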
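
The label change described above can be sketched in plain Python. The answer does not show its code, so the helper names angle_to_label and label_to_angle are hypothetical, and rounding the prediction back to a whole degree is an assumption.

```python
NUM_DEGREES = 360

def angle_to_label(angle_deg):
    """Map an integer angle in [0, 359] to a regression target in [0, 1)."""
    return angle_deg / NUM_DEGREES

def label_to_angle(label):
    """Map a sigmoid output in [0, 1] back to an integer angle in [0, 359]."""
    return round(label * NUM_DEGREES) % NUM_DEGREES

print(angle_to_label(90))    # 0.25
print(label_to_angle(0.25))  # 90
```

In model.py this would presumably correspond to rotate_pictures returning rotation / 360 instead of the one-hot vector, a final Dense(1, activation='sigmoid') layer, and loss='mse' in model.compile. One caveat of this encoding is that 359 degrees and 0 degrees get labels that are far apart (about 0.997 and 0.0) even though the angles differ by only one degree.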

After these modifications, I got the following results after only 10 training epochs:
