Keras custom loss for one-hot encoding

I currently have a trained DNN that predicts, as a one-hot encoded classification, which state a game is in. Basically, suppose there are three states: 0, 1, or 2.

Now, I would normally use categorical_crossentropy as the loss function, but I realized that not all classifications are equal for my states. For example:

  • If the model predicts it should be state 1, there is no cost to my system if that classification is wrong, because state 1 basically means doing nothing, so reward 0x.
  • If the model correctly predicts state 0 or 2 (i.e. predicted = 2 and correct = 2), then the reward should be 3x.
  • If the model incorrectly predicts state 0 or 2 (i.e. predicted = 2 and correct = 0), then the reward should be -1x (see the matrix sketch below).
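Read together, the rules amount to a 3×3 reward matrix indexed by (true state, predicted state). A sketch, assuming that wrongly predicting 0 or 2 costs -1x even when the true state is 1:

# REWARD[true][pred]: predicting state 1 is always free (0x),
# a correct 0 or 2 earns 3x, a wrong 0 or 2 costs -1x.
REWARD = [
    [ 3.0, 0.0, -1.0],  # true = 0
    [-1.0, 0.0, -1.0],  # true = 1
    [-1.0, 0.0,  3.0],  # true = 2
]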

I know we can declare a custom loss function in Keras, but I'm stuck on how to form it. Does anyone have suggestions for translating this pseudocode? I can't figure out how to do it with vector operations.

Bonus question: I think what I'm really after is a reward function. Is that the same thing as a loss function? Thanks!

def custom_expectancy(y_expected, y_pred):

    # Get 0, 1 or 2
    expected_norm = tf.argmax(y_expected)
    predicted_norm = tf.argmax(y_pred)

    # Some pseudo code....
    # Now, if predicted == 1
    #     loss += 0
    # elif predicted == expected
    #     loss -= 3
    # elif predicted != expected
    #     loss += 1
    #
    # return loss

References:

https://datascience.stackexchange.com/questions/55215/how-do-i-create-a-keras-custom-loss-function-for-a-one-hot-encoded-binary-classi

Custom loss in Keras with softmax to one-hot

Code update

import tensorflow as tf

def custom_expectancy(y_expected, y_pred):
    # Get 0, 1 or 2
    expected_norm = tf.argmax(y_expected)
    predicted_norm = tf.argmax(y_pred)

    results = tf.unstack(expected_norm)

    # Some pseudo code....
    # Now, if predicted == 1
    #     loss += 0
    # elif predicted == expected
    #     loss += 3
    # elif predicted != expected
    #     loss -= 1

    for idx in range(0, len(expected_norm)):
        predicted = predicted_norm[idx]
        expected = expected_norm[idx]

        if predicted == 1:  # do nothing
            results[idx] = 0.0
        elif predicted == expected:  # reward
            results[idx] = 3.0
        else:  # wrong, so we lost
            results[idx] = -1.0

    return tf.stack(results)

I think this is what I'm after, but I haven't quite figured out how to build the correct tensor (it should be batch-sized) to return.

spy387824256's answer: Keras custom loss for one-hot encoding

The best way to build a conditional custom loss is to use tf.keras.backend.switch without involving loops.

In your case, you should combine two switch conditional expressions to obtain the desired result.

The desired loss function can be reproduced in this way:

def custom_expectancy(y_expected, y_pred):
    zeros = tf.cast(tf.reduce_sum(y_pred * 0, axis=-1), tf.float32)  ### important to produce gradient
    y_expected = tf.cast(tf.reshape(y_expected, (-1,)), tf.float32)
    class_pred = tf.argmax(y_pred, axis=-1)
    class_pred = tf.cast(class_pred, tf.float32)

    cond1 = (class_pred != y_expected) & (class_pred != 1)
    cond2 = (class_pred == y_expected) & (class_pred != 1)

    res1 = tf.keras.backend.switch(cond1, zeros - 1, zeros)
    res2 = tf.keras.backend.switch(cond2, zeros + 3, zeros)

    return res1 + res2

Here cond1 is when the model wrongly predicts state 0 or 2, and cond2 is when the model correctly predicts state 0 or 2. The standard state is zeros, which is returned when neither cond1 nor cond2 is activated. (The zeros term multiplies y_pred by 0 so that y_pred stays connected to the computation graph and the loss can produce a gradient.)

You can notice that y_expected can be passed as a simple tensor/array of integer-encoded states (no need to one-hot them).

The loss function works in this way:

true = tf.constant([[1],[2],[1],[0]])  ## no need to one-hot
pred = tf.constant([[0,1,0],[0,0,1],[1,0,0],[0,1,0]])

custom_expectancy(true, pred)

which returns:

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 0.,  3., -1.,  0.], dtype=float32)>

This seems to fit our needs.

To use the loss in a model:
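A minimal sketch, assuming a generic three-class softmax classifier (the architecture and input size below are placeholders, not from the original answer):

import tensorflow as tf

# Hypothetical model: any network ending in a 3-way softmax works here.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(10,)),  # placeholder input size
    tf.keras.layers.Dense(3, activation='softmax'),
])

# Pass the custom loss directly to compile(); Keras reduces the
# per-sample values returned by custom_expectancy over the batch.
model.compile(optimizer='adam', loss=custom_expectancy)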

Here is the running notebook.

Another answer:

Here there is a nice post explaining the concepts of the loss function and the cost function. Several answers illustrate how different authors in the machine learning field view them.

As for the loss function, you may find the following implementation useful. It implements a weighted cross-entropy loss that lets you weight each class proportionally during training, and it could be adapted to satisfy the constraints specified above (see the sketch below).
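For illustration, a minimal class-weighted categorical cross-entropy might look like the following (a sketch, not the linked implementation); the weights [3.0, 0.0, 3.0] mirror the question's idea that state 1 costs nothing while states 0 and 2 count 3x:

import tensorflow as tf

def weighted_categorical_crossentropy(weights):
    # weights: one scalar per class, e.g. [3.0, 0.0, 3.0].
    w = tf.constant(weights, dtype=tf.float32)
    def loss(y_true, y_pred):
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0)  # avoid log(0)
        # Standard cross-entropy term, scaled per class by w.
        return -tf.reduce_sum(w * y_true * tf.math.log(y_pred), axis=-1)
    return loss

# Usage:
# model.compile(optimizer='adam',
#               loss=weighted_categorical_crossentropy([3.0, 0.0, 3.0]))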

Another answer:

Here is a way to do what you want. If your ground truth y_true is dense (shape Nx3), you can use tf.reduce_all(y_true == [0.0, 0.0, 1.0], axis=-1, keepdims=True) and tf.reduce_all(y_true == [1.0, 0.0, 0.0], axis=-1, keepdims=True) to drive the if/elif/else. You can optimize it further with tf.gather.

def sparse_loss(y_true, y_pred):
  """Calculate loss for game. Follows keras loss signature.

  Args:
    y_true: Sparse tensor of shape Nx1, where the correct prediction
      is encoded as 0, 1, or 2.
    y_pred: Tensor of shape Nx3. For each row, the three columns
      represent the predicted probability of each state.
      For example, [0.1, 0.3, 0.6] means, "There's a 10% chance the
      right state is 0, a 30% chance the right state is 1, and a
      60% chance the right state is 2".
  """

  # This is the unvectorized implementation on individual rows, which is more
  # intuitive. But TF requires vectorization.
  # if y_true == 0:
  #   # Value vector has shape 3. Broadcasting will occur.
  #   return -tf.reduce_sum(y_pred * [3.0, 0.0, -1.0])
  # elif y_true == 2:
  #   return -tf.reduce_sum(y_pred * [-1.0, 0.0, 3.0])
  # else:
  #   # According to the rules, this is never the correct
  #   # state to predict, so it should never show up.
  #   assert False, f'Impossible state reached. y_true: {y_true}, y_pred: {y_pred}.'

  # We vectorize by calculating the reward of all predictions for both cases:
  # y_true is zero and y_true is two. To eliminate this inefficiency, we
  # could use tf.gather to build an Nx3-shaped matrix to multiply against.
  reward_for_true_zero = tf.reduce_sum(y_pred * [3.0, 0.0, -1.0], axis=-1, keepdims=True)  # Nx1
  reward_for_true_two = tf.reduce_sum(y_pred * [-1.0, 0.0, 3.0], axis=-1, keepdims=True)   # Nx1

  reward = tf.where(y_true == 0.0, reward_for_true_zero, reward_for_true_two)  # Nx1
  return -tf.reduce_sum(reward)
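The tf.gather optimization mentioned in the comments could be sketched like this (an assumption of what the author had in mind, not their code). It builds each row's value vector by indexing a small table with y_true, and it also gives the true == 1 row an explicit value consistent with the question's rules:

import tensorflow as tf

def sparse_loss_gather(y_true, y_pred):
  # values[i] is the reward vector to dot against y_pred when the true state is i.
  # Row 1 covers true == 1, which the comments above treat as unreachable;
  # here it follows the question's rules (wrongly predicting 0 or 2 costs -1).
  values = tf.constant([[ 3.0, 0.0, -1.0],
                        [-1.0, 0.0, -1.0],
                        [-1.0, 0.0,  3.0]])
  idx = tf.cast(tf.reshape(y_true, (-1,)), tf.int32)
  per_row = tf.gather(values, idx)         # Nx3
  return -tf.reduce_sum(per_row * y_pred)  # scalar, as in sparse_loss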

