formula = 'Survived ~ C(Pclass) + C(Sex) + Age + SibSp'
train_titanic = titanic.iloc[0:600, :]
test_titanic = titanic.iloc[600:, :]
y_train, x_train = dmatrices(formula, data=train_titanic, return_type='dataframe')
y_test, x_test = dmatrices(formula, data=test_titanic, return_type='dataframe')
model = sm.logit(y_train,x_train)
res = model.fit()
print(res.summary())
The code above fails with a "model is missing required outcome variables" error, as shown below:

Traceback (most recent call last):
  File "C:\Users\OWNER\AppData\Local\Programs\Python\Python37-32\titanicmodel.py", line 36, in <module>
    model = sm.logit(y_train,x_train)
  File "C:\Users\OWNER\AppData\Local\Programs\Python\Python37-32\lib\site-packages\statsmodels\base\model.py", line 159, in from_formula
    missing=missing)
  File "C:\Users\OWNER\AppData\Local\Programs\Python\Python37-32\lib\site-packages\statsmodels\formula\formulatools.py", line 65, in handle_formula_data
    NA_action=na_action)
  File "C:\Users\OWNER\AppData\Local\Programs\Python\Python37-32\lib\site-packages\patsy\highlevel.py", line 312, in dmatrices
    raise PatsyError("model is missing required outcome variables")
patsy.PatsyError: model is missing required outcome variables