When predicting, even images from the training dataset give opposite values

2024-04-29 • Q&A

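I am new to ML and TensorFlow. I am trying to build a CNN that classifies damaged images, similar to the rock-paper-scissors tutorial for TensorFlow, but with only two classes.

Model architecture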

import tensorflow as tf

# training_datagen, validation_datagen, TRAINING_DIR and VALIDATION_DIR
# are defined earlier (not shown in the question).
train_generator = training_datagen.flow_from_directory(
    TRAINING_DIR, target_size=(150, 150), class_mode='categorical'
)

validation_generator = validation_datagen.flow_from_directory(
    VALIDATION_DIR, target_size=(150, 150), class_mode='categorical'
)

model = tf.keras.models.Sequential([
    # Note the input shape is the desired size of the image 150x150 with 3 bytes color
    # This is the first convolution
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The second convolution
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The third convolution
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The fourth convolution
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),
    # 512 neuron hidden layer
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(2, activation='softmax')
])

model.summary()

model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

# fit_generator is deprecated since TF 2.1; model.fit accepts generators directly.
history = model.fit_generator(train_generator, epochs=25, validation_data=validation_generator, verbose=1)

model.save("rps.h5")

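The only changes I made were changing the input shape from (150, 150, 1) to (150, 150, 3) and changing the output of the last layer from 3 neurons to 2. Training consistently gives me above 90% accuracy on a dataset of 600 images per class. However, when I predict using the code from the tutorial, it gives me badly wrong values, even for data taken from the dataset.

Prediction

Original code from the TensorFlow tutorial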

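import numpy as np
from tensorflow.keras.preprocessing import image

# onlyfiles is the list of image paths to classify (defined elsewhere, not shown)
for fn in onlyfiles:
  path = fn
  img = image.load_img(path, target_size=(150, 150, 3))  # changed target_size to (150, 150, 3) from (150, 150)
  x = image.img_to_array(img)
  x = np.expand_dims(x, axis=0)  # add the batch dimension

  images = np.vstack([x])
  classes = model.predict(images, batch_size=10)
  print(fn)
  print(classes)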

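I changed target_size from (150, 150) to (150, 150, 3) because I believed it was needed, since my inputs are 3-channel images.

Result

It gives badly wrong values such as [0, 1] [0, 1], even for images from the dataset.

But when I change the code to this: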

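for fn in onlyfiles:
  path = fn
  img = image.load_img(path, target_size=(150, 150, 3))
  x = image.img_to_array(img)
  x = np.expand_dims(x, axis=0)  # add the batch dimension
  x /= 255.  # rescale pixel values from [0, 255] to [0, 1]
  images = np.vstack([x])
  classes = model.predict(images, batch_size=10)
  print(fn)
  print(classes)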

In this case the values look like this:

    [[9.9999774e-01 2.2242968e-06]]
    [[9.9999785e-01 2.1864464e-06]]
    [[9.9999785e-01 2.1641024e-06]]

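One or two are wrong, but most are correct.

So my question is: even though the last activation is softmax, why are the outputs now shown as decimal values, and is there a logical error in the way I am making predictions? I also tried binary, but it made little difference.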

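Softmax returns a probability distribution over the classes for the vector it receives as input, so getting decimal values is not a problem. If you want the exact class each image belongs to, try applying the argmax function to the predictions.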
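To make this concrete, here is a minimal NumPy sketch of softmax followed by argmax. The logit values are made up for illustration and are not taken from the question's model; the `softmax` helper is written by hand rather than taken from Keras.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax along the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

# Made-up raw scores for two images and two classes (illustrative only).
logits = np.array([[6.5, -6.5],
                   [0.2, 1.1]])

probs = softmax(logits)                # each row is a probability distribution
predicted = np.argmax(probs, axis=-1)  # index of the most probable class per row

print(probs)
print(predicted)  # [0 1]
```

Each row of `probs` sums to 1, which is why the model prints values such as 9.99e-01 and 2.2e-06 rather than hard 0/1 labels.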

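Please note:

When you change the output classes from 2 to 3, you are asking the model to classify into 3 categories. That contradicts your problem statement, which separates good images from bad ones, i.e. 2 output classes (a binary problem). If I understood the question correctly, I think this should be reverted from 3 to 2.
Second, the output you are getting is perfectly fine: a neural network model outputs probabilities rather than absolute class values (such as 0 or 1). The probabilities indicate how likely the image is to belong to class 0 or class 1.
Also, as @BBloggsbott mentioned, you only need to apply np.argmax to the output array; it returns the index of the class with the highest probability (0 or 1 here). Hope this helps. Thanks.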
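The points above can be sketched as a small helper that rescales an image the same way at prediction time and then turns the probabilities into a label. This is only a sketch under assumptions: the label names `damaged` and `intact` are hypothetical (with Keras the mapping would come from `train_generator.class_indices`, which orders subdirectories alphabetically), and the 1/255 rescale assumes the training generator used the same scaling, which the question does not show. Both helper functions are illustrative names, not library calls.

```python
import numpy as np

def prepare_batch(pixels, rescale=1.0 / 255.0):
    """Turn one HxWxC image array into a float batch of shape (1, H, W, C)."""
    # Assumption: training used the same rescale factor.
    x = np.asarray(pixels, dtype=np.float32) * rescale
    return np.expand_dims(x, axis=0)

def decode_predictions(probs, class_indices):
    """Map each row of softmax probabilities to a (label, confidence) pair."""
    index_to_label = {idx: label for label, idx in class_indices.items()}
    best = np.argmax(probs, axis=1)
    return [(index_to_label[int(i)], float(row[i])) for i, row in zip(best, probs)]

# Hypothetical mapping; not taken from the question.
class_indices = {"damaged": 0, "intact": 1}

# A dummy all-white image stands in for image.img_to_array(img).
batch = prepare_batch(np.full((150, 150, 3), 255, dtype=np.uint8))

# First probability row printed in the question.
probs = np.array([[9.9999774e-01, 2.2242968e-06]])
decoded = decode_predictions(probs, class_indices)

print(batch.shape)
print(decoded)
```

In real use, `batch` would be passed to `model.predict` and its result handed to `decode_predictions`.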
