HACKER Q&A
📣 bjourne

What is your experience with OpenCL?


Does anyone here use it for HPC? It's very difficult to tell whether it will be big in the future or whether it's a failed attempt.


  👤 ColonelPhantom Accepted Answer ✓
I don't think any GPU vendors really back OpenCL anymore. I honestly doubt it will see much, if any, growth in the future.

Of course, the most used stack is currently CUDA, which is proprietary and only works on Nvidia.

AMD made its own version called HIP, which translates quite directly to CUDA, but also runs on AMD cards via ROCm.

Intel has its own stack called oneAPI, which I think is based on SYCL, a Khronos standard (just like OpenCL/OpenGL/Vulkan/etc). I believe SYCL programs can also be run on AMD and NVIDIA using third-party compilers/translation tools such as hipSYCL, and I think SYCL can also be compiled to OpenCL.

I recently also heard about efforts to support running HIP programs on SYCL, so hopefully GPU compute will soon be less vendor-bound, with CUDA translating relatively easily to HIP, and HIP and SYCL interoperating in both directions without compatibility problems.


👤 rhn_mk1
It's not in Nvidia's interest to promote a competitor to their own solution, which is locked exclusively to their GPUs. AMD lacks the interest to push ROCm to consumers and instead seems to be focusing on HPC deployments, which means they don't have to care about interoperability either.

The only hope is that Intel pushes OpenCL as one of the selling points of their new Arc GPUs. They seem to be starting with the gaming market, however, which isn't very interested in compute. It could be that they have plans to attack the non-HPC market, in which case it would make sense to back OpenCL.

At the same time, Intel is developing oneAPI, so it may make more sense to look at that instead of OpenCL.


👤 saltcured
I found OpenCL useful for a narrowly focused project at work, using the Python OpenCL bindings. My experience is somewhat dated: I did most of this about 4-5 years ago and then stopped paying much attention to it. That code continues to run in production without any real maintenance; it just gets redeployed onto newer OS and Python installs and keeps working, which I consider a point in its favor.

From others' comments here, it seems there hasn't been much development since then in OpenCL land. I imagine details of device support might vary over generations of hardware and driver releases... I made use of half-precision float storage but ran compute at single-precision since my devices did not offer fast half-precision math.
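The half-storage/single-compute pattern described above can be sketched in plain NumPy as a rough analogue of OpenCL's half-float storage buffers (the array size and the specific math here are made up for illustration):

```python
import numpy as np

# Store data compactly as half precision (2 bytes/element),
# mirroring half-float storage buffers on the device.
stored = np.random.rand(1024).astype(np.float16)

# Upcast to single precision for the actual arithmetic, since
# the devices in question had no fast half-precision math.
x = stored.astype(np.float32)
result = np.sqrt(x * x + 1.0)  # compute happens at float32

# Downcast back to half only when writing results out.
out = result.astype(np.float16)

print(stored.nbytes, x.nbytes)  # half-precision buffer is half the size
```

In OpenCL C the same split is done with `half` storage buffers read via `vload_half`/`vstore_half` while keeping kernel arithmetic in `float`.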

This was small "hpc", i.e. a task that would run on one workstation or a modest single-socket or dual-socket x86_64 server or VM. The most performance-sensitive aspect was that it was also used during the data loading/startup phase of an interactive tool, so a human user was impatiently waiting for results. The first prototype just used Python numpy routines for convolutions, etc. I used OpenCL to get running time from many minutes down to tens of seconds and called it good enough.
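For context, the kind of pure-NumPy baseline being described might look something like this (illustrative only; the actual kernels and data are not given in the comment):

```python
import numpy as np

# Stand-in for the original NumPy prototype: smooth a signal
# with a simple box filter via 1D convolution.
signal = np.arange(10, dtype=np.float64)
kernel = np.ones(3) / 3.0

# mode="same" keeps the output length equal to the input length,
# zero-padding at the edges.
smoothed = np.convolve(signal, kernel, mode="same")
```

Routines like this are correct but single-threaded, which is why moving the hot loop to an OpenCL kernel (or any vectorized device backend) can cut minutes down to seconds on larger arrays.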

I enjoyed that I could run the same code on a workstation GPU with the NVIDIA driver or via x86 multithreaded SIMD using the Intel driver. I did not do any real work with Intel or AMD GPUs because of the hardware selection we had on hand. I also needed more than 6GB of GPU RAM for it to be worth using. Even my Titan X with 12GB was only about 2-3x faster than x86 SIMD for my problem, due to the complex tradeoffs between RAM and bus bandwidths for data transfers. This was after I'd already done some algorithmic optimizations to bring down the compute/IO ratio.
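The bandwidth tradeoff above can be sanity-checked with back-of-envelope arithmetic. All numbers below are assumptions for illustration, not figures from the comment:

```python
# Rough model: a GPU only wins if its compute advantage outweighs
# the cost of shipping data across the PCIe bus.
data_gb = 10.0       # assumed working-set size
pcie_gb_s = 12.0     # assumed effective PCIe 3.0 x16 bandwidth
cpu_time_s = 30.0    # assumed CPU-side (x86 SIMD) compute time
gpu_compute_s = 8.0  # assumed GPU-side compute time

transfer_s = 2 * data_gb / pcie_gb_s  # copy data in, copy results back
gpu_total_s = gpu_compute_s + transfer_s

speedup = cpu_time_s / gpu_total_s
print(f"GPU total: {gpu_total_s:.2f}s, speedup: {speedup:.1f}x")
```

With numbers in this range, transfer time alone erodes a nearly 4x raw compute advantage down to about 3x end-to-end, which is roughly the shape of the tradeoff being described.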

A big thing I hear others talk about is the rich development tooling around CUDA versus the relatively impoverished OpenCL tooling. I am old-school enough to get along without it. I was able to compensate for the limited Python OpenCL tools by doing some of my development and debugging cycle embedded in hacked-up variants of my own viz tools. You might think of this a bit like debugging with printf, except my print statement could send a dense 3D array into an OpenGL-based renderer on my workstation.