Oversampling for text classification in Python?

I have a DataFrame of text that I want to classify, but I need to oversample it first. Sample data is below:

import pandas as pd

# 10 positive and 4 negative rows, matching the output shown below
df=[['I am going to class today']*10+['I am not going to class today']*4,['Positive']*10+['Negative']*4]
df=pd.DataFrame(df)
df=df.transpose()
df.columns=['Features','Class']
df
          Features                       Class
0   I am going to class today       Positive
1   I am going to class today       Positive
2   I am going to class today       Positive
3   I am going to class today       Positive
4   I am going to class today       Positive
5   I am going to class today       Positive
6   I am going to class today       Positive
7   I am going to class today       Positive
8   I am going to class today       Positive
9   I am going to class today       Positive
10  I am not going to class today   Negative
11  I am not going to class today   Negative
12  I am not going to class today   Negative
13  I am not going to class today   Negative

from collections import Counter
from imblearn.over_sampling import RandomOverSampler

oversample = RandomOverSampler(sampling_strategy='minority')
# fit and apply the transform
X_over,y_over = oversample.fit_resample(df['Features'],df['Class'])
# summarize class distribution
print(Counter(y_over))

But this does not work and gives me ValueError: Expected 2D array, got 1D array instead:. How can I oversample this data?

Answer:

I found the problem. I needed to reshape the data so that the features are a 2D array with a single column.

X_over,y_over = oversample.fit_resample(df['Features'].values.reshape(-1,1),df['Class'])
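
For anyone wondering why the reshape matters: df['Features'] is a 1D Series, while fit_resample expects a 2D array of shape (n_samples, n_features). Here is a small sketch that only inspects the shapes, rebuilding the 10/4 sample data from the question. The names X_1d, X_2d and X_frame are just for illustration. Selecting with double brackets, df[['Features']], also gives a 2D object and should be accepted in the same way, though I have not checked that against every imbalanced-learn version.

```python
import pandas as pd

# Same class balance as the question: 10 positive rows, 4 negative rows.
df = pd.DataFrame({
    'Features': ['I am going to class today'] * 10
                + ['I am not going to class today'] * 4,
    'Class': ['Positive'] * 10 + ['Negative'] * 4,
})

X_1d = df['Features']                        # Series: 1D, this triggers the error
X_2d = df['Features'].values.reshape(-1, 1)  # ndarray with a single column: 2D
X_frame = df[['Features']]                   # one-column DataFrame: also 2D

print(X_1d.shape)     # (14,)
print(X_2d.shape)     # (14, 1)
print(X_frame.shape)  # (14, 1)
```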

It works now.

Counter({'Positive': 10,'Negative': 10})
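
The balanced counts above come from duplicating randomly chosen minority-class rows until that class is as large as the majority class. As a rough illustration of the idea only (this is not imbalanced-learn's implementation, and oversample_minority is a hypothetical helper written for this post), a standard-library sketch could look like this:

```python
import random
from collections import Counter

def oversample_minority(X, y, seed=0):
    """Hypothetical helper: duplicate random minority rows until classes balance."""
    rng = random.Random(seed)
    counts = Counter(y)
    minority = min(counts, key=counts.get)   # label with the fewest rows
    target = max(counts.values())            # size of the largest class
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    # Draw with replacement from the minority rows to fill the gap.
    extra = [rng.choice(minority_idx) for _ in range(target - counts[minority])]
    X_over = list(X) + [X[i] for i in extra]
    y_over = list(y) + [y[i] for i in extra]
    return X_over, y_over

# One feature per row, mirroring the (n_samples, 1) shape used above.
X = [['I am going to class today']] * 10 + [['I am not going to class today']] * 4
y = ['Positive'] * 10 + ['Negative'] * 4

X_over, y_over = oversample_minority(X, y)
print(Counter(y_over))  # Counter({'Positive': 10, 'Negative': 10})
```

Note that the new rows are exact copies of existing texts, so this only rebalances the class counts and adds no new information.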