Is my Python polynomial regression plot wrong?

  


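I'm new to Python and am trying to fit a third-order polynomial regression to some data. The fit I get is not what I expected, and I'm trying to understand why polynomial regression in Python comes out worse than in Excel. When I fit the same data in Excel, I get a coefficient of determination of about 0.95 and the curve looks like a third-order polynomial. With scikit-learn, however, I get about 0.78 and the fit is almost linear. Is this happening because I don't have enough data? Could using x values of type datetime64[ns] on the x-axis also affect the regression? The code runs, but I'm not sure whether this is a coding problem or something else.

I am using Anaconda (Python 3.7) and running the code in Spyder.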

import operator
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import mean_squared_error, r2_score
#import data
data = pd.read_excel(r'D:\Anaconda\Anaconda\XData\data.xlsx',skiprows = 0)

x=np.c_[data['Date']]
y=np.c_[data['level']]
#regression
polynomial_features= PolynomialFeatures(degree=3)
x_poly = polynomial_features.fit_transform(x)

model = LinearRegression()
model.fit(x_poly,y)
y_poly_pred = model.predict(x_poly)
#check regression stats
rmse = np.sqrt(mean_squared_error(y,y_poly_pred))
r2 = r2_score(y,y_poly_pred)
print(rmse)
print(r2)

#plot
plt.scatter(x,y,s=10)

# sort the values of x before line plot
sort_axis = operator.itemgetter(0)
sorted_zip = sorted(zip(x,y_poly_pred),key=sort_axis)
x,y_poly_pred = zip(*sorted_zip)
plt.plot(x,y_poly_pred,color='m')
plt.show()


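Answer by comeonace:

The problem is the datetime64[ns] type on the x-axis. There is an issue on GitHub about how datetime64[ns] is handled internally in sklearn. In short, in this case the datetime64[ns] values are treated as features of the order of 10¹⁸ in magnitude: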

x_poly
Out[91]: 
array([[1.00000000e+00, 1.29911040e+18, 1.68768783e+36, 2.19249281e+54],
       [1.00000000e+00, 1.33617600e+18, 1.78536630e+36, 2.38556361e+54],
       [1.00000000e+00, 1.39129920e+18, 1.93571346e+36, 2.69315659e+54],
       [1.00000000e+00, 1.41566400e+18, 2.00410456e+36, 2.83713868e+54],
       [1.00000000e+00, 1.43354880e+18, 2.05506216e+36, 2.94603190e+54],
       [1.00000000e+00, 1.47061440e+18, 2.16270671e+36, 3.18050764e+54],
       [1.00000000e+00, 1.49670720e+18, 2.24013244e+36, 3.35282236e+54],
       [1.00000000e+00, 1.51476480e+18, 2.29451240e+36, 3.47564662e+54],
       [1.00000000e+00, 1.57610880e+18, 2.48411895e+36, 3.91524174e+54]])
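To see why columns of this size are a problem, here is a minimal NumPy-only sketch. It rebuilds the same four columns from the nanosecond values printed above (using `np.vander` as a stand-in for `PolynomialFeatures(degree=3)` on a single feature, which is my substitution, not the answer's code) and asks NumPy for the numerical rank of the matrix. The expectation is that the solver cannot distinguish all four columns at double precision.

```python
import numpy as np

# Date values in nanoseconds since 1970-01-01, copied from the x_poly output above.
x_ns = np.array([1.29911040e18, 1.33617600e18, 1.39129920e18,
                 1.41566400e18, 1.43354880e18, 1.47061440e18,
                 1.49670720e18, 1.51476480e18, 1.57610880e18])

# Columns 1, x, x^2, x^3: the same layout PolynomialFeatures(degree=3)
# produces for a single input feature.
x_poly = np.vander(x_ns, 4, increasing=True)

# Largest entry per column: roughly 1, 1e18, 1e36 and 1e54.
col_max = x_poly.max(axis=0)
print(col_max)

# Numerical rank at double precision. A value below 4 means some columns
# are lost to rounding, so the fit behaves like a lower-order model.
rank = np.linalg.matrix_rank(x_poly)
print(rank)
```

A rank below 4 is consistent with the nearly linear curve described in the question: the least-squares solver effectively drops the directions it cannot resolve.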

The simplest way to handle this is to use StandardScaler, or to convert the datetimes with pd.to_numeric and rescale them:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_scaled = scaler.fit_transform(np.c_[data['Date']])

or simply

x_scaled = np.c_[pd.to_numeric(data['Date'])] / 10e17  # convert to integer nanoseconds and scale (10e17 == 1e18)

This gives properly scaled features:

x_poly = polynomial_features.fit_transform(x_scaled)
x_poly
Out[94]: 
array([[1., 1.2991104, 1.68768783, 2.19249281],
       [1., 1.336176, 1.7853663, 2.38556361],
       [1., 1.3912992, 1.93571346, 2.69315659],
       [1., 1.415664, 2.00410456, 2.83713868],
       [1., 1.4335488, 2.05506216, 2.9460319],
       [1., 1.4706144, 2.16270671, 3.18050764],
       [1., 1.4967072, 2.24013244, 3.35282236],
       [1., 1.5147648, 2.2945124, 3.47564662],
       [1., 1.5761088, 2.48411895, 3.91524174]])
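The effect of the rescaling on the fit itself can be checked with a small experiment. The sketch below is an illustration under stated assumptions: the x values are the nanosecond dates printed earlier, but the target `y` is synthetic (an exact cubic I made up, not the asker's data), and `cubic_r2` is a hypothetical helper. It fits the same third-order model once on the raw values and once on the values divided by 1e18.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.preprocessing import PolynomialFeatures

# Date values in nanoseconds, copied from the first x_poly output.
x_ns = np.array([1.29911040e18, 1.33617600e18, 1.39129920e18,
                 1.41566400e18, 1.43354880e18, 1.47061440e18,
                 1.49670720e18, 1.51476480e18, 1.57610880e18]).reshape(-1, 1)

# Synthetic target (an assumption, not the asker's data): an exact cubic
# in the rescaled date, so a working third-order fit should be near-perfect.
t = x_ns.ravel() / 1e18
y = 50 * (t - 1.4) ** 3 - 4 * (t - 1.4) ** 2 + 0.5 * t


def cubic_r2(x):
    """Fit a third-order polynomial regression on x and return its R^2."""
    x_poly = PolynomialFeatures(degree=3).fit_transform(x)
    model = LinearRegression().fit(x_poly, y)
    return r2_score(y, model.predict(x_poly))


r2_raw = cubic_r2(x_ns)            # features up to about 1e54
r2_scaled = cubic_r2(x_ns / 1e18)  # features of order 1
print(f"raw R^2: {r2_raw:.4f}, scaled R^2: {r2_scaled:.4f}")
```

Because the synthetic target is an exact cubic, the scaled fit should reach an R² very close to 1, while the raw fit can only match or fall short of it. How large the gap is depends on the data, so the numbers here say nothing about the 0.78 versus 0.95 in the question.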

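Edit: keep your original x for plotting. To make predictions, you should apply the same transformation to the features you want to predict. The result will then look like this: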

x = np.c_[data['Date']]
x_scaled = np.c_[pd.to_numeric(data['Date'])] / 10e17  # convert and scale
polynomial_features = PolynomialFeatures(degree=3)
x_poly = polynomial_features.fit_transform(x_scaled)

model = LinearRegression()
model.fit(x_poly,y)
y_poly_pred = model.predict(x_poly)

# test to predict
s_test = pd.to_datetime(pd.Series(['1/1/2013','5/5/2019']))
x_test = np.c_[s_test]
x_poly_test = polynomial_features.transform(np.c_[pd.to_numeric(s_test)] / 10e17)
y_test_pred = model.predict(x_poly_test)

plt.scatter(x,y,s=10)
# plot predictions as red dots
plt.scatter(x_test,y_test_pred,s=10,c='red')
plt.plot(x,y_poly_pred,color='m')
plt.show()
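One way to make sure training and prediction always go through the same conversion is to put it in a single helper and bundle the remaining steps in a pipeline. This is a sketch of that idea rather than the answer's own code: `dates_to_scaled` is a hypothetical helper, the training dates and target are invented for illustration, and the two test dates are the ones used above written in ISO format.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures


def dates_to_scaled(dates):
    """Convert dates to nanoseconds since 1970-01-01, divided by 1e18."""
    ns = np.asarray(dates, dtype="datetime64[ns]").astype("int64")
    return ns.reshape(-1, 1) / 1e18


# Hypothetical training data (not the asker's): dates and an exact cubic target.
train_dates = ["2011-03-03", "2012-05-05", "2014-01-11",
               "2015-06-06", "2017-06-06", "2019-12-12"]
t_train = dates_to_scaled(train_dates).ravel()
y_train = 50 * (t_train - 1.4) ** 3 + 0.5 * t_train

# The pipeline keeps the polynomial expansion and the regression together,
# so predict() repeats exactly the expansion that fit() used.
model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
model.fit(dates_to_scaled(train_dates), y_train)

# Predict for new dates by passing them through the same helper.
test_dates = ["2013-01-01", "2019-05-05"]
y_test_pred = model.predict(dates_to_scaled(test_dates))
print(y_test_pred)
```

The original datetime values can still be used on the x-axis when plotting; only the values passed to the model need to go through the helper.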


Link to this article: https://www.f2er.com/3000067.html
