Why do we include the target class in both arrays passed to train_test_split?

2024-05-19 • Q&A

X_train,test_df,y_train,y_test = train_test_split(result,y_true,stratify = y_true,test_size = 0.2)

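In the train_test_split example above, result is a DataFrame and y_true is a NumPy array built from the DataFrame's target class column.

My question is: if we already pass y_true separately, why do we pass the entire result DataFrame as one of the input arguments to train_test_split? In other words, shouldn't we first exclude the target class column from the result DataFrame?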

Answer by xzh16:

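Scikit-learn supports pandas but does not require it. With NumPy arrays it does not always make sense to keep features and labels in the same array, which is why train_test_split takes them as separate arguments. The function does not know which column is the target: it simply splits every array it is given along the same row indices. It is therefore up to you to make sure that your result DataFrame and its splits have the format you need. If y_true is part of the result DataFrame, you can (and should) exclude that column, either before or after the function call.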
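A minimal sketch of both options mentioned above, dropping the target column before the split or after it. The toy data and the column names feature_a, feature_b and target are assumptions made for illustration; the original question does not show the contents of result. random_state is added here only to make the two splits comparable.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data; the column name "target" is an assumption.
result = pd.DataFrame({
    "feature_a": np.arange(20),
    "feature_b": np.arange(20) * 2.0,
    "target": [0, 1] * 10,
})
y_true = result["target"].to_numpy()

# Option 1: drop the target column before splitting.
X = result.drop(columns=["target"])
X_train, X_test, y_train, y_test = train_test_split(
    X, y_true, stratify=y_true, test_size=0.2, random_state=0
)

# Option 2: split the whole DataFrame, then drop the target column.
train_df, test_df, y_train2, y_test2 = train_test_split(
    result, y_true, stratify=y_true, test_size=0.2, random_state=0
)
X_train2 = train_df.drop(columns=["target"])
X_test2 = test_df.drop(columns=["target"])
```

Either way the model should only be fitted on a feature matrix that no longer contains the target column; otherwise the label leaks into the features.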

machine-learning scikit-learn train-test-split

Link to this article: https://www.f2er.com/2843621.html
