I'm trying to build a neural network for binary classification, but unfortunately it always predicts 0, even though a fifth of the training labels are 1, and I can't figure out why. My dataset looks like this, so there are a few categorical variables and a few continuous ones (target is the column we predict):
The data can be downloaded here: https://drive.google.com/drive/folders/1PsG2rRdbxyocyqvLSa7zSy_aVDMRJ2Ug?usp=sharing
You can read it with

df = pd.read_csv("train.csv", index_col=0)
Now I prepare the data for the network:
x_train=df.drop(labels=['target'],axis=1).values
y_train=df['target'].values
X_train, X_val, y_train, y_val = train_test_split(
    x_train, y_train, test_size=0.2)
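To confirm the imbalance (roughly a fifth of the labels are 1), I check the class counts. A minimal sketch, with a synthetic label array standing in for the real y_train:

```python
import numpy as np

# synthetic stand-in for the real y_train: 20% ones, 80% zeros
y_train = np.array([0, 0, 0, 0, 1] * 200)

counts = np.bincount(y_train)        # counts per class: [zeros, ones]
pos_frac = counts[1] / counts.sum()  # fraction of positive labels
print(counts, pos_frac)              # → [800 200] 0.2
```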
LR = 0.001
EPOCH = 50
BATCH_SIZE = 64
torch_X_train = torch.tensor(X_train)
torch_y_train = torch.tensor(y_train)
torch_X_val = torch.tensor(X_val)
torch_y_val = torch.tensor(y_val)
train = torch.utils.data.TensorDataset(torch_X_train, torch_y_train)
validate = torch.utils.data.TensorDataset(torch_X_val, torch_y_val)
train_loader = torch.utils.data.DataLoader(train, batch_size=BATCH_SIZE, shuffle=True)
val_loader = torch.utils.data.DataLoader(validate, shuffle=False)
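Note that tensors built this way inherit NumPy's float64 dtype for the features, which is why I cast with .float() in the training loop below. A quick check, with dummy arrays in place of the CSV data:

```python
import numpy as np
import torch

# dummy stand-ins for the real feature/label arrays
X_train = np.random.rand(100, 7)             # NumPy floats are float64 by default
y_train = np.random.randint(0, 2, size=100)  # integer class labels

torch_X_train = torch.tensor(X_train)
torch_y_train = torch.tensor(y_train)
print(torch_X_train.dtype)  # torch.float64, not the float32 the model expects
```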
and define a simple network with only linear layers:
class NN(nn.Module):
    def __init__(self, input_size):
        super().__init__()
        self.to_class = nn.Sequential(
            nn.Linear(input_size, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.Linear(256, 32), nn.Linear(32, 2)
        )

    def forward(self, inputs):
        pred = self.to_class(inputs)
        return F.softmax(pred, dim=1)
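Since the forward pass ends in a softmax, every output row is a probability pair summing to 1. For example, on a random batch (a self-contained sketch repeating the class above):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NN(nn.Module):
    def __init__(self, input_size):
        super().__init__()
        self.to_class = nn.Sequential(
            nn.Linear(input_size, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.Linear(256, 32), nn.Linear(32, 2)
        )

    def forward(self, inputs):
        return F.softmax(self.to_class(inputs), dim=1)

net = NN(7)
out = net(torch.randn(4, 7))  # batch of 4 random samples, 7 features each
print(out.sum(dim=1))         # each row sums to ~1.0
```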
Finally I train it:
net=NN(7)
loss_func = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(),lr=LR)
train_loss = np.zeros(EPOCH)
val_loss = np.zeros(EPOCH)
acc_train = []
acc_val=[]
for epoch in range(EPOCH):
    correct = 0
    total = 0
    for data in train_loader:
        X, y = data
        optimizer.zero_grad()
        output = net(X.float())
        total += output.size(0)
        pred = output.argmax(dim=1, keepdim=True)
        correct += pred.eq(y.view_as(pred)).sum().item()
        loss = loss_func(output.squeeze(), y)
        loss.backward()
        optimizer.step()
        train_loss[epoch] += loss.item()  # .item() to accumulate a plain float
    train_loss[epoch] /= len(train_loader)
    acc_train.append(correct / total * 100)
    print('epoch %d:\t train_accuracy %.5f\ttrain loss: %.5f' % (epoch, acc_train[epoch], train_loss[epoch]))
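The val_loss and acc_val arrays above are filled by an analogous pass over val_loader each epoch, without gradients. A self-contained sketch of that validation step (the net, loss_func, and loader here are dummy stand-ins so it runs on its own; in my script it sits inside the epoch loop):

```python
import numpy as np
import torch
import torch.nn as nn

# dummy stand-ins so this sketch runs standalone
net = nn.Linear(7, 2)
loss_func = nn.CrossEntropyLoss()
X_val = torch.randn(20, 7)
y_val = torch.randint(0, 2, (20,))
val_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(X_val, y_val), shuffle=False)

EPOCH, epoch = 1, 0
val_loss = np.zeros(EPOCH)
acc_val = []

# validation pass: like the training loop, but no backward/step
net.eval()
correct, total = 0, 0
with torch.no_grad():
    for X, y in val_loader:
        output = net(X.float())
        val_loss[epoch] += loss_func(output, y).item()
        pred = output.argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.size(0)
val_loss[epoch] /= len(val_loader)
acc_val.append(correct / total * 100)
net.train()
print('val accuracy %.5f\tval loss: %.5f' % (acc_val[epoch], val_loss[epoch]))
```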
But the training loss goes nowhere and the predictions are always a single class! Can someone explain what's happening and suggest how to improve it?