I have a DataFrame of text that I want to classify, but I need to oversample it first. Here is some sample data:
import pandas as pd

# 10 positive and 4 negative rows, matching the output shown below
df = pd.DataFrame({
    'Features': ['I am going to class today'] * 10 + ['I am not going to class today'] * 4,
    'Class': ['Positive'] * 10 + ['Negative'] * 4,
})
df
                         Features     Class
0       I am going to class today  Positive
1       I am going to class today  Positive
2       I am going to class today  Positive
3       I am going to class today  Positive
4       I am going to class today  Positive
5       I am going to class today  Positive
6       I am going to class today  Positive
7       I am going to class today  Positive
8       I am going to class today  Positive
9       I am going to class today  Positive
10  I am not going to class today  Negative
11  I am not going to class today  Negative
12  I am not going to class today  Negative
13  I am not going to class today  Negative
from collections import Counter
from imblearn.over_sampling import RandomOverSampler

oversample = RandomOverSampler(sampling_strategy='minority')
# fit and apply the transform
X_over, y_over = oversample.fit_resample(df['Features'], df['Class'])
# summarize class distribution
print(Counter(y_over))
But this does not work and raises "ValueError: Expected 2D array, got 1D array instead".
How can I oversample this data?