For a project, I need to modify part of the sklearn source code, and I have run into a problem with Cython. I have a C struct defined as:
    cdef struct split_info_struct:
        # Same as the SplitInfo class, but we need a C struct to use it in the
        # nogil sections and to use in arrays.
        Y_DTYPE_C gain
        int feature_idx
        unsigned int bin_idx
        unsigned char missing_go_to_left
        Y_DTYPE_C sum_gradient_left
        Y_DTYPE_C sum_gradient_right
        Y_DTYPE_C sum_hessian_left
        Y_DTYPE_C sum_hessian_right
        unsigned int n_samples_left
        unsigned int n_samples_right
        Y_DTYPE_C* test_list
(Y_DTYPE_C is defined in another .pxd file as ctypedef np.npy_float64 Y_DTYPE_C), and the Python class is defined as:
    class SplitInfo:
        """Pure data class to store information about a potential split."""
        def __init__(self, gain, feature_idx, bin_idx, missing_go_to_left,
                     sum_gradient_left, sum_hessian_left,
                     sum_gradient_right, sum_hessian_right,
                     n_samples_left, n_samples_right,
                     test_list=np.zeros(shape=(44))):
            self.gain = gain
            self.feature_idx = feature_idx
            self.bin_idx = bin_idx
            self.missing_go_to_left = missing_go_to_left
            self.sum_gradient_left = sum_gradient_left
            self.sum_hessian_left = sum_hessian_left
            self.sum_gradient_right = sum_gradient_right
            self.sum_hessian_right = sum_hessian_right
            self.n_samples_left = n_samples_left
            self.n_samples_right = n_samples_right
            self.test_list = test_list  # numpy array or Python list
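After some computation (using prange and the GIL), I end up with a split_info_struct (C) that I want to convert to a SplitInfo (Python). If I leave out the test_list attribute, the conversion works with the following code:

    split_infos_ = [x for x in split_infos[:n_features]]

where split_infos is an array of split_info_struct. However, if I keep test_list in the struct, I get the following error: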
sklearn/ensemble/_hist_gradient_boosting/splitting.pyx:450:14: Cannot convert 'split_info_struct *' to Python object
Traceback (most recent call last):
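How should I solve this? I know the problem is similar to "Return a struct from C to Python using Cython", but there the struct consists only of simple types, whereas here test_list is a numpy array or a list. I suspect that is the problem, as pointed out in the second answer to the linked question.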
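
To make my suspicion concrete, here is a minimal sketch in plain Python, with ctypes standing in for the Cython struct. It is not the sklearn code: SplitInfoStruct, struct_to_split_info and N_TEST are hypothetical names, only a few fields are shown, and the length 44 is an assumption taken from the np.zeros(shape=(44)) default above. The point is that a bare pointer carries no length, so there is no automatic way to turn it into a Python object; the elements have to be copied out explicitly with a length supplied by the caller.

```python
import ctypes

# Assumed fixed length of test_list (taken from the SplitInfo default above).
N_TEST = 44


class SplitInfoStruct(ctypes.Structure):
    # Hypothetical ctypes stand-in for split_info_struct (subset of fields).
    _fields_ = [
        ("gain", ctypes.c_double),
        ("feature_idx", ctypes.c_int),
        ("bin_idx", ctypes.c_uint),
        ("test_list", ctypes.POINTER(ctypes.c_double)),
    ]


class SplitInfo:
    """Reduced version of the Python data class, for illustration only."""

    def __init__(self, gain, feature_idx, bin_idx, test_list):
        self.gain = gain
        self.feature_idx = feature_idx
        self.bin_idx = bin_idx
        self.test_list = test_list


def struct_to_split_info(s, n_test=N_TEST):
    # The pointer has no length of its own, so n_test must come from the caller.
    # Copy element by element into a Python list owned by the new object.
    test_list = [s.test_list[i] for i in range(n_test)]
    return SplitInfo(s.gain, s.feature_idx, s.bin_idx, test_list)


# Build one struct whose pointer refers to a C array of N_TEST doubles.
buf = (ctypes.c_double * N_TEST)(*[0.5 * i for i in range(N_TEST)])
s = SplitInfoStruct(
    gain=1.25,
    feature_idx=3,
    bin_idx=7,
    test_list=ctypes.cast(buf, ctypes.POINTER(ctypes.c_double)),
)
info = struct_to_split_info(s)
```

If this is indeed the cause, I assume the Cython equivalent would be a hand-written conversion function that builds each SplitInfo field by field and copies test_list with a known length, instead of relying on the implicit struct conversion in the list comprehension.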