What is SYCL?
SYCL (pronounced ‘sickle’) is a royalty-free, cross-platform abstraction layer that builds on the underlying concepts, portability and efficiency of OpenCL. It enables code for heterogeneous processors to be written in a “single-source” style using completely standard C++. With single-source development, C++ template functions can contain both host and device code, so developers can construct complex algorithms that use OpenCL acceleration and then re-use them throughout their source code on different types of data.
SYCL hides much of the complexity of OpenCL, substantially reducing the amount of host-side code needed compared with traditional OpenCL. At the same time, SYCL lets developers continue to use all OpenCL features via different parts of the SYCL API. As a developer, you can therefore use as much or as little of the SYCL interface as you like, matching the requirements of your application. This is a new approach to heterogeneous programming in C++ that allows you to develop software in standard C++ that takes advantage of low-power but highly parallel hardware. Some examples of the C++ features supported in SYCL are templates, classes, operator overloading, static polymorphism and lambdas.
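The idea of writing an algorithm once as a template and re-using it on different types of data can be illustrated in plain C++. The following is a minimal sketch and not SYCL API code: `for_each_index` and `scaled_sum` are hypothetical names introduced here, and the sequential loop only stands in for the place where a SYCL runtime would dispatch a kernel to a device.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a data-parallel launch: calls `kernel` once per index.
// A real SYCL runtime would dispatch these invocations to a device instead.
template <typename Kernel>
void for_each_index(std::size_t count, Kernel kernel) {
    for (std::size_t i = 0; i < count; ++i) {
        kernel(i);
    }
}

// Generic algorithm written once and re-used for any element type T.
// Assumes `a` and `b` have the same length.
template <typename T>
std::vector<T> scaled_sum(const std::vector<T>& a, const std::vector<T>& b, T scale) {
    std::vector<T> out(a.size());
    // The lambda plays the role of the kernel body.
    for_each_index(a.size(), [&](std::size_t i) { out[i] = scale * (a[i] + b[i]); });
    return out;
}
```

The same `scaled_sum` template can be called with a `std::vector<int>` or a `std::vector<double>` without any change to the algorithm itself.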
To enable parallelism, SYCL offers four ways in which a kernel function can be executed. A kernel is the code that runs on the parallel hardware.
Work Group Data Parallel: a kernel function is executed using an OpenCL "nd range". An nd range specifies a 1-, 2- or 3-dimensional grid of work items, each of which executes the kernel function; the work items are executed together in work groups. The nd range consists of two 1-, 2- or 3-dimensional ranges: the global work size (specifying the full range of work items) and the local work size (specifying the range of each work group). In this execution mode, synchronization within a work group can be performed using barriers.
Basic Data Parallel: a kernel function is executed with a single range specifying the global work size; the local work size is then determined by the SYCL host-side runtime. In this mode, there is no synchronization within work groups.
Single Task: a kernel function is executed just once. This is effectively the same as executing an nd range with a global work size of { 1, 1, 1 }.
Hierarchical Data Parallel: a kernel function executes in a work group data parallel way, but SYCL provides an alternative multi-level syntax for defining this form of parallelism. The hierarchical syntax consists of an outer “parallel for work group” loop that is executed for each work group in the nd range and an inner “parallel for work item” loop that is executed for each work item in the work group. The hierarchical syntax is a clearer way of writing parallel OpenCL code as it highlights the nature of the parallelism.
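The relationship between the global work size, the local work size and the work groups can be modelled on the host in plain C++. The following is a minimal sketch and not SYCL API code: `WorkItemIndex` and `enumerate_nd_range` are hypothetical names introduced for illustration, the range is one-dimensional, and the global work size is assumed to be a multiple of the local work size. The nested loops mirror the hierarchical form, with an outer loop over work groups and an inner loop over the work items of each group; a real runtime would execute these in parallel rather than sequentially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of where one work item sits in a 1-D nd range.
struct WorkItemIndex {
    std::size_t global_id;  // position in the full global range
    std::size_t group_id;   // which work group the item belongs to
    std::size_t local_id;   // position inside that work group
};

// Enumerate a 1-D nd range as an outer loop over work groups and an inner
// loop over the work items in each group.
// Assumption for this sketch: global_size is a multiple of local_size.
std::vector<WorkItemIndex> enumerate_nd_range(std::size_t global_size,
                                              std::size_t local_size) {
    assert(local_size > 0 && global_size % local_size == 0);
    std::vector<WorkItemIndex> items;
    const std::size_t num_groups = global_size / local_size;
    for (std::size_t group = 0; group < num_groups; ++group) {      // per work group
        for (std::size_t local = 0; local < local_size; ++local) {  // per work item
            items.push_back({group * local_size + local, group, local});
        }
    }
    return items;
}
```

For example, `enumerate_nd_range(8, 4)` produces eight work items in two work groups, and the item with global id 5 has group id 1 and local id 1. In this model a single task corresponds to `enumerate_nd_range(1, 1)`, which produces exactly one work item.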
If you want to learn more about using SYCL, there is a SYCL guide, a ComputeCpp Getting Started guide and an Integration guide, and the full SYCL specification is available on the Khronos website.