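I need to work through the rows of a NumPy array one by one, so I have to write an explicit loop. I tried using prange to speed the loop up, but the call ends in a deadlock. I am most likely doing something wrong, and I may be misunderstanding what a race condition means in this situation. How can I fix this, or what is the correct way to process the rows of an array in parallel? Below is reproducible code for IPython cells: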
%load_ext cython
import numpy as np

%%cython --compile-args=-fopenmp --link-args=-fopenmp --force
# If on Windows, replace the line above with: %%cython --compile-args=/openmp --link-args=/openmp --force
cimport cython
from cython.parallel cimport prange
@cython.boundscheck(False)
@cython.wraparound(False)
cdef double sum_row(double [:] arr) nogil:
    cdef int i
    cdef int size = arr.shape[0]
    cdef double s = 0
    for i in range(size):
        s += arr[i]
    return s

@cython.boundscheck(False)
@cython.wraparound(False)
cpdef double sum_all_rows(double [:,:] arr) nogil:
    cdef int i
    # The number of rows is the size of the first dimension.
    cdef int n_rows = arr.shape[0]
    cdef double s = 0
    for i in range(n_rows):
        s += sum_row(arr[i])
    return s

@cython.boundscheck(False)
@cython.wraparound(False)
cpdef double sum_all_rows_parallel(double [:,:] arr) nogil:
    cdef int i
    # The number of rows is the size of the first dimension.
    cdef int n_rows = arr.shape[0]
    cdef double s = 0
    for i in prange(n_rows):
        s += sum_row(arr[i])
    return s

X = np.array([[3.14,2.71,0.002],[0.5,1,4.21],[0.001,0.002,0.003],[-0.1,-0.11,-0.12]])
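As a sanity check on what both Cython functions are supposed to return, here is a minimal pure-Python sketch of the same row-by-row sum. The names `reference_sum_all_rows` and `X_list` are mine and only for illustration; `X_list` holds the same values as `X`, written as nested lists so the snippet runs without NumPy or Cython.

```python
import math


def reference_sum_all_rows(rows):
    """Pure-Python reference: sum each row, then add up the row sums."""
    total = 0.0
    for row in rows:  # one iteration per row, like the loop in sum_all_rows
        row_sum = 0.0
        for value in row:  # mirrors the inner loop in sum_row
            row_sum += value
        total += row_sum
    return total


# Same values as X above, as plain nested lists (hypothetical helper data).
X_list = [
    [3.14, 2.71, 0.002],
    [0.5, 1, 4.21],
    [0.001, 0.002, 0.003],
    [-0.1, -0.11, -0.12],
]

expected = reference_sum_all_rows(X_list)
print(expected)  # roughly 11.238, up to floating-point rounding
```

A result from `sum_all_rows(X)` can be compared against `expected` with `math.isclose` rather than `==`, since the order of floating-point additions may differ.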
%timeit sum_all_rows(X)
754 ns ± 38.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
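The parallel version, sum_all_rows_parallel(X), hangs. I stopped it after about 5 minutes, and during that time I saw no increase in either memory usage or CPU usage.
I have used prange on toy examples before and saw the expected speedup, so I know I am able to compile it correctly.
Also, is there any recommended reading to better understand this kind of problem?
Thank you for your help.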