Is there a good OpenCL wrapper for Ruby?

I know of:

https://github.com/lsegal/barracuda

which has not been updated since 01/11, and

http://rubyforge.org/projects/ruby-opencl/

which has not been updated since 03/10.

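Are these projects dead? Or have they simply not changed because they already work and neither OpenCL nor Ruby has changed since then? Is anyone using them? Any luck?

If not, can you recommend another OpenCL gem for Ruby? Or how are such calls usually made? By calling raw C from Ruby?

Thanks!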

Answer

You can try opencl_ruby_ffi. It is actively developed (by a colleague of mine) and works with OpenCL 1.2. OpenCL 2.0 support should also arrive soon.

    sudo gem install opencl_ruby_ffi

On the Khronos forum you can find a quick example that shows how it works:

    require 'opencl_ruby_ffi'
    # NArray with FFI pointer access; normally installed as a dependency
    # of opencl_ruby_ffi
    require 'narray_ffi'

    # select the first platform/device available
    # improve this if you have multiple GPUs on your machine
    platform = OpenCL::platforms.first
    device = platform.devices.first

    # prepare the source of the GPU kernel
    # this is not Ruby but OpenCL C
    source = <<~EOF
      __kernel void addition(float2 alpha, __global const float *x, __global float *y) {
        size_t ig = get_global_id(0);
        y[ig] = (alpha.s0 + alpha.s1 + x[ig]) * 0.3333333333333333333f;
      }
    EOF

    # configure the OpenCL environment, refer to the OpenCL API if necessary
    context = OpenCL::create_context(device)
    queue = context.create_command_queue(device, :properties => OpenCL::CommandQueue::PROFILING_ENABLE)

    # create and compile the OpenCL C source code
    prog = context.create_program_with_source(source)
    prog.build

    # allocate CPU (= RAM) buffers and
    # fill the input one with random values
    a_in = NArray.sfloat(65536).random(1.0)
    a_out = NArray.sfloat(65536)

    # allocate GPU buffers matching the CPU ones
    b_in = context.create_buffer(a_in.size * a_in.element_size, :flags => OpenCL::Mem::COPY_HOST_PTR, :host_ptr => a_in)
    b_out = context.create_buffer(a_out.size * a_out.element_size)

    # create a constant pair of floats
    f = OpenCL::Float2::new(3.0, 2.0)

    # run kernel 'addition' over 65536 work-items, in work-groups of 128
    event = prog.addition(queue, [65536], f, b_in, b_out, :local_work_size => [128])
    # Or, if you want to be more OpenCL-like:
    # k = prog.create_kernel("addition")
    # k.set_arg(0, f)
    # k.set_arg(1, b_in)
    # k.set_arg(2, b_out)
    # event = queue.enqueue_NDrange_kernel(k, [65536], :local_work_size => [128])

    # tell OpenCL to transfer the content of GPU buffer b_out
    # to CPU memory (a_out), but only after `event` (= kernel execution)
    # has completed
    queue.enqueue_read_buffer(b_out, a_out, :event_wait_list => [event])

    # wait for everything in the command queue to finish
    queue.finish
    # now a_out contains the result of the addition performed on the GPU

    # add some cleanup here ...

    # verify that the computation went well:
    # a_in - 3*a_out + alpha.s0 + alpha.s1 should be close to zero
    diff = (a_in - a_out * 3.0)
    65536.times { |i|
      raise "Computation error #{i} : #{diff[i] + f.s0 + f.s1}" if (diff[i] + f.s0 + f.s1).abs > 0.00001
    }
    puts "Success!"
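
If you want to sanity-check the kernel's arithmetic without a GPU or an OpenCL driver, a plain-Ruby reference can help. The following is only a sketch: `reference_addition` is a hypothetical helper name (not part of opencl_ruby_ffi), it uses ordinary Ruby arrays and double precision instead of NArray and single-precision floats, and it simply mirrors the formula from the `addition` kernel and the identity used in the final check.

```ruby
# CPU reference for the 'addition' kernel, in plain Ruby (no OpenCL needed).
# reference_addition is a hypothetical helper, not part of opencl_ruby_ffi.
def reference_addition(alpha, x)
  s = alpha[0] + alpha[1]
  # same formula as the kernel: y[i] = (alpha.s0 + alpha.s1 + x[i]) / 3
  x.map { |v| (s + v) * (1.0 / 3.0) }
end

alpha = [3.0, 2.0]          # same constants as OpenCL::Float2::new(3.0, 2.0)
x = Array.new(8) { rand }   # small stand-in for the 65536-element input
y = reference_addition(alpha, x)

# Same identity the script checks: x - 3*y + alpha.s0 + alpha.s1 should be ~0
max_err = x.zip(y).map { |xi, yi| (xi - yi * 3.0 + alpha[0] + alpha[1]).abs }.max
puts "max error: #{max_err}"
```

Comparing `a_out` against such a reference (with a tolerance suited to single precision) is one way to tell a kernel bug apart from a buffer-transfer problem.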
