Sklearn component in a pipeline is still not fitted even though the whole pipeline is fitted?

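I am trying to pick out one component/transformer from a fitted pipeline to check its behavior. However, when I retrieve the component, it is reported as not fitted, even though using the whole pipeline works without any problem. That indicates the pipeline is fitted, so its components should be fitted as well.

Can anyone explain why this happens, and suggest how to inspect the components of a fitted pipeline?

Here is a reproducible example: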

import pandas as pd
import numpy as np

from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler,OneHotEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split,GridSearchCV

np.random.seed(0)

# Read data from Titanic dataset.
titanic_url = ('https://raw.githubusercontent.com/amueller/'
               'scipy-2017-sklearn/091d371/notebooks/datasets/titanic3.csv')
data = pd.read_csv(titanic_url)

# We create the preprocessing pipelines for both numeric and categorical data.
numeric_features = ['age','fare']
numeric_transformer = Pipeline(steps=[
    ('imputer',SimpleImputer(strategy='median')),('scaler',StandardScaler())])

categorical_features = ['embarked','sex','pclass']
categorical_transformer = Pipeline(steps=[
    ('imputer',SimpleImputer(strategy='constant',fill_value='missing')),('onehot',OneHotEncoder(handle_unknown='ignore'))])

preprocessor = ColumnTransformer(
    transformers=[
        ('num',numeric_transformer,numeric_features),('cat',categorical_transformer,categorical_features)])

# Append classifier to preprocessing pipeline.
# Now we have a full prediction pipeline.
clf = Pipeline(steps=[('preprocessor',preprocessor),('classifier',LogisticRegression(solver='lbfgs'))])

X = data.drop('survived',axis=1)
y = data['survived']

X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2)

clf.fit(X_train,y_train)
print("model score: %.3f" % clf.score(X_test,y_test))

Calling either of the following:

clf.get_params()['preprocessor__cat__imputer'].transform(X)

clf.named_steps['preprocessor'].transformers[0][1].named_steps['imputer'].transform(X)

results in an error like this:

NotFittedError: This SimpleImputer instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.
wyq_1234 answered:

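The ColumnTransformer attribute transformers holds the input, unfitted transformers. During fit, ColumnTransformer fits clones of them and leaves the originals untouched. To access the fitted transformers, use the attribute transformers_ or named_transformers_. I would guess that get_params()['preprocessor__cat__imputer'] is also returning the unfitted input transformer.

(You will still get an error, though, because that imputer will also try to process the string columns of X, which fails with strategy='median'.)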
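To make the difference concrete, here is a minimal sketch of reaching the fitted imputer through named_transformers_. It is not the original poster's code: the tiny DataFrame is made-up stand-in data so the example runs without downloading the Titanic file, the pipeline is trimmed to a single categorical column, and only the numeric columns are passed to the imputer to avoid the string-data problem mentioned above.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.exceptions import NotFittedError
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up stand-in for the Titanic data (assumption: values are illustrative only).
X = pd.DataFrame({
    'age': [22.0, np.nan, 35.0, 54.0, 2.0, 27.0],
    'fare': [7.25, 71.28, np.nan, 51.86, 21.08, 11.13],
    'sex': ['male', 'female', 'female', 'male', 'male', 'female'],
})
y = [0, 1, 1, 0, 0, 1]

numeric_features = ['age', 'fare']
categorical_features = ['sex']

numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())])
categorical_transformer = Pipeline(steps=[
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])

preprocessor = ColumnTransformer(transformers=[
    ('num', numeric_transformer, numeric_features),
    ('cat', categorical_transformer, categorical_features)])

clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', LogisticRegression())])
clf.fit(X, y)

# Fitted clone: go through named_transformers_ (note the trailing underscore).
fitted_imputer = (clf.named_steps['preprocessor']
                  .named_transformers_['num']
                  .named_steps['imputer'])
# Pass only the columns this imputer was fitted on.
imputed = fitted_imputer.transform(X[numeric_features])

# Unfitted original: the `transformers` attribute still holds the input objects.
unfitted_imputer = (clf.named_steps['preprocessor']
                    .transformers[0][1]
                    .named_steps['imputer'])
try:
    unfitted_imputer.transform(X[numeric_features])
    raised = False
except NotFittedError:
    raised = True

print("NaNs left after fitted imputer:", int(np.isnan(imputed).sum()))
print("unfitted imputer raised NotFittedError:", raised)
```

The same pattern should carry over to the original example: replace transformers[0][1] with named_transformers_['num'] and call transform on X[numeric_features] rather than on the full X.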

