library(xgboost)
data(agaricus.train,package='xgboost')
data(agaricus.test,package='xgboost')
# Initialize baseline predictions to be 1.5
baseline_predictions <- rep(1.5,nrow(agaricus.train$data))
# base_margin is the base prediction XGBoost will boost from
dtrain <- xgb.DMatrix(agaricus.train$data,label = agaricus.train$label,base_margin = baseline_predictions)
dtest <- xgb.DMatrix(agaricus.test$data)
param <- list(max_depth = 2,eta = 1,verbose = 0,nthread = 2,objective = "binary:logistic",eval_metric = "auc")
# Train model
bst <- xgb.train(param,dtrain,nrounds = 2)
# Predict on test set
predict(bst,newdata = dtest)
In the code above, I trained an xgboost model called bst that was initialized from baseline_predictions. I then used the predict function to apply the model to the test set dtest.
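My question is: how can I find out what predict(bst,newdata = dtest) is boosting from? I understand that I can use the following code to extract the baseline values that bst boosted from during training: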
xgboost::getinfo(object = dtrain,name = "base_margin")
However, because I did not specify base_margin in xgb.DMatrix(agaricus.test$data), running the following code returns NULL:
xgboost::getinfo(object = dtest,name = "base_margin")
So is predict(bst,newdata = dtest) boosting from baseline_predictions (which is 1.5 for every observation), or from something else?
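One way to probe this empirically is to predict on the raw margin scale twice: once on a test matrix with no base_margin, and once on a test matrix where base_margin is set to 1.5 explicitly. The sketch below is only a minimal check, not a definitive answer. The names dtest_plain, dtest_offset, margin_plain, and margin_offset are introduced here for illustration, and the exact fallback value used when base_margin is missing may depend on the xgboost version and its base_score setting.

```r
library(xgboost)
data(agaricus.train, package = "xgboost")
data(agaricus.test, package = "xgboost")

# Train exactly as in the question: every training row starts from a margin of 1.5
dtrain <- xgb.DMatrix(agaricus.train$data,
                      label = agaricus.train$label,
                      base_margin = rep(1.5, nrow(agaricus.train$data)))
param <- list(max_depth = 2, eta = 1, nthread = 2,
              objective = "binary:logistic", eval_metric = "auc")
bst <- xgb.train(param, dtrain, nrounds = 2)

# Test matrix without base_margin (as in the question)
dtest_plain <- xgb.DMatrix(agaricus.test$data)

# Hypothetical comparison matrix: same features, base_margin set to 1.5 explicitly
dtest_offset <- xgb.DMatrix(agaricus.test$data,
                            base_margin = rep(1.5, nrow(agaricus.test$data)))

# outputmargin = TRUE returns untransformed margins (log-odds for binary:logistic)
margin_plain  <- predict(bst, newdata = dtest_plain,  outputmargin = TRUE)
margin_offset <- predict(bst, newdata = dtest_offset, outputmargin = TRUE)

# If the two sets of margins differ, predict() on dtest_plain did not start from 1.5.
# A constant difference is the gap between 1.5 and whatever default was used instead.
summary(margin_offset - margin_plain)
```

If the summary shows a nonzero difference, the predictions on the plain test matrix are not boosted from baseline_predictions, and the test matrix would need its own base_margin to reproduce the training-time offset.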