Objective of this document
This guide explains how to integrate ComputeCpp with an existing application.
This document covers:
- What ComputeCpp is
- The requirements for a C++ application to integrate with ComputeCpp
- How to integrate ComputeCpp into an existing C++ application
- How to integrate ComputeCpp with an existing build system
- How to redistribute an application built with ComputeCpp
The following sections delve into further details on the above topics.
What is ComputeCpp?
ComputeCpp™ is a conformant implementation of the SYCL™ 1.2.1 Khronos standard for OpenCL™ 1.2 devices, with SPIR™ 1.2, SPIR-V, and experimental PTX support. ComputeCpp supports AMD® and Intel® OpenCL 1.2 devices with compatible drivers. Any platform with OpenCL SPIR 1.2, SPIR-V, or PTX support should work, for example any ARM™ device that features an OpenCL platform with that capability. ComputeCpp can also target the SYCL host device, which executes kernels on the system CPU. Note that PTX and SPIR-V support is experimental for the time being.
The ComputeCpp SDK complements the ComputeCpp Package with build system integration, sample code and documentation. The ComputeCpp SDK is available on GitHub.
Please refer to the Platform Support Notes for details on the supported platforms.
Figure 1. ComputeCpp Package Content Diagram
The ComputeCpp Package contains the ComputeCpp implementation files. The components of the package are shown in the diagram above and described below:
- bin/compute++ : The device compiler, which generates the integration header from a SYCL source file.
- bin/computecpp_info : A tool that provides information about the platform.
- libComputeCpp.so : The ComputeCpp runtime library.
- include/SYCL/ : The ComputeCpp implementation headers.
These components can be used together with a build system to build SYCL programs on different platforms.
Note that the components of the ComputeCpp Package are designed to work together; it is not possible to mix components from different versions or to use components in isolation.
Application requirements in order to use the ComputeCpp Package
There are two sets of requirements for the ComputeCpp Package: (1) the requirements for building an application with ComputeCpp and (2) the requirements for running an application that uses ComputeCpp.
Building applications with ComputeCpp
ComputeCpp uses a multiple compiler approach to implement the various passes on the input file. This facilitates the integration of ComputeCpp with existing build systems, since there is no need to replace the existing host compiler used for an application. The device compiler, compute++, generates an Integration Header that is later included by a host compiler as a standard C++ program header. This allows the SYCL Runtime to interact with the kernel binaries generated by the device compiler. The diagram below illustrates the build flow required for ComputeCpp to work.
Figure 2. ComputeCpp Program Build
To implement ComputeCpp's multiple-compiler approach, the build system has to include the integration header in the compilation flow. This implies that compute++ must compile the source and produce the integration header before the host compiler processes the same file, since the host compiler must see the definitions contained in the header when processing the source file. There are several ways to achieve this:
- Forced inclusion of the integration header: The build system can force the inclusion of the header file before the source file is processed by the host compiler. This way, the developer does not have to include specific files manually, and the build system can fully control the creation and inclusion of the integration header.
- Manual inclusion of the integration header: Developers can manually include the integration header at the start of their source file, like any other include file. Note that the integration header needs to exist before the host compiler starts compiling the source file, so the developer must re-run compute++ to regenerate the integration header every time the SYCL source file changes.
Note that, in all cases, any source file containing SYCL kernels must be processed with compute++ before the host compiler processes it. This is typically implemented in build systems via a dependency between the integration header and the SYCL source file that contains the kernels. Also note that the integration header must be the first file included by the source file, to avoid conflicting definitions of the kernel names. Based on our experience, we recommend the forced inclusion of the integration header. The following sections show how to implement this method using various supported build systems.
Note that the output of compute++ is always an integration header with the .sycl extension. Refer to the compute++ documentation for details on the different options and outputs for the device compiler.
Building a SYCL application using ComputeCpp requires the availability of the OpenCL headers in addition to the OpenCL platform runtime library. This is normally part of the SDK of the chosen OpenCL platform.
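For example, on Debian-based Linux distributions the OpenCL headers and the ICD loader development library can typically be installed as follows (the package names are a distribution-specific assumption; vendor SDKs may ship their own copies):

```shell
# Debian/Ubuntu package names; other distributions and vendor SDKs differ.
sudo apt-get install opencl-headers ocl-icd-opencl-dev
```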
Using Unix Makefile Files
Integrating ComputeCpp's single-source, multiple-compiler-pass build into a Makefile is straightforward. The Trivial Makefile example below shows a simple Makefile that builds an application executable from a source file named file.cpp.
To obtain the final file.exe binary, rule (1) is triggered. This rule uses the host compiler to parse the C++ file. Note that we use the force-include option of the host compiler to include the integration header before the source file (file.cpp) is processed.
The target file.exe depends on two other targets, file.cpp and file.sycl. file.cpp is the actual source file, which triggers a re-run of the file.exe target when there are changes to the source file. file.sycl is a target defined in rule (2), which triggers compute++ on the source file to produce the integration header (with the default name of file.sycl).
Note that the file.sycl target depends on the source file itself, file.cpp, so that compute++ is re-run every time there is a change to the source file.
Using these two dependent targets, we can process the same source file with two separate compilers.
Trivial Makefile Example
file.sycl: file.cpp (2)
	compute++ ${COMPUTECPP_DEVICE_COMPILER_FLAGS} file.cpp

file.exe: file.sycl file.cpp (1)
	$(CXX) $(CXXFLAGS) -include file.sycl file.cpp -o file.exe
(1) Basic rule that triggers the creation of the executable by building the source file
(2) Rule that triggers the creation of the integration header
See the tools/Linux/Makefile directory of the ComputeCpp SDK for more examples of Makefile integration.
Using the unified compiler driver
Since version 0.4.0, compute++ features a unified driver for the compilation of SYCL applications. This unified driver allows for a single call to the compute++ to compile both host and device code, simplifying integration with existing projects and build systems. Refer to the compute++ documentation for details on how to use the driver.
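As an illustrative sketch, a single-invocation build might look like the command below. The -sycl-driver flag and the install path are assumptions here; refer to the compute++ documentation for the authoritative flag set.

```shell
# Single call to compute++ compiles both host and device code.
# -sycl-driver and the install location are assumptions; check the
# compute++ documentation for the exact flags of your package version.
COMPUTECPP_DIR=/path/to/computecpp
compute++ -sycl-driver \
  -I"${COMPUTECPP_DIR}/include" \
  file.cpp \
  -L"${COMPUTECPP_DIR}/lib" -lComputeCpp \
  -o file.exe
```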
Using CMake
The ComputeCpp SDK provides a FindComputeCpp.cmake module, which locates the ComputeCpp package and provides custom functions to ease integration of ComputeCpp into existing CMake projects. FindComputeCpp.cmake supports CMake versions 3.4.3 and later. See the Platform Support Notes for details.
Developers can include the FindComputeCpp.cmake module in their project. FindComputeCpp.cmake provides a set of macros and variables that can be used throughout the CMake build configuration.
Note that some of the CMake variables are obtained at configure time by querying the computecpp_info tool, for example for the device compiler's required flags or the current package version. This guarantees that FindComputeCpp.cmake is always up to date with the latest ComputeCpp Package options; however, it requires execute permission on the computecpp_info binary. When configuring a project that uses FindComputeCpp.cmake, the variable COMPUTECPP_PACKAGE_ROOT_DIR must be defined and set to the base directory where the corresponding ComputeCpp package lives on the system. Refer to the FindComputeCpp.cmake file itself for the latest documentation and the variables it exports.
The example below illustrates the basic usage of FindComputeCpp.cmake.
Once FindComputeCpp.cmake is included in a CMakeLists.txt file, a series of predefined CMake path variables (set to relevant paths of the ComputeCpp Package) are available to the project's CMake configuration [1]. In this case, we use COMPUTECPP_INCLUDE_DIRECTORY to add the ComputeCpp API headers to the include paths processed by CMake. We use add_executable to create an executable target in CMake that is composed of a single file (syclProgram.cpp). Then we use add_sycl_to_target (defined in FindComputeCpp.cmake) to add a dependent target to syclProgram that triggers the generation of the integration header by compute++. The add_sycl_to_target function adds properties to the source file, for example C++11 mode, and also handles the forced inclusion of the integration header using the appropriate method for the host compiler.
Basic usage of the FindComputeCpp module
cmake_minimum_required(VERSION 3.4.3)
project(my_sycl_program)
set(CMAKE_MODULE_PATH /path/to/computecpp-sdk/cmake/Modules/)
include(FindComputeCpp)
include_directories(${COMPUTECPP_INCLUDE_DIRECTORY})
add_executable(syclProgram ${CMAKE_CURRENT_SOURCE_DIR}/syclProgram.cpp)
add_sycl_to_target(TARGET syclProgram SOURCES ${CMAKE_CURRENT_SOURCE_DIR}/syclProgram.cpp)
More complex options (such as using multiple C++ files with SYCL kernels defined) are possible by using this module and the CMake variables it defines directly. A low-level build_spir function is also provided, but using it directly is not supported. The build_spir helper function creates a CMake custom target that triggers the device compiler on every change to the source file.
FindComputeCpp.cmake is used to build the sample code of the ComputeCpp SDK, and is used in various Codeplay-supported projects as the main build method for SYCL programs. We recommend this method rather than manual compilation or custom scripts, since it can be easily adapted to future changes in the ComputeCpp toolchain.
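As a sketch of the multi-file case, the fragment below lists two SYCL source files in one target. The file names are illustrative assumptions; the functions and variables come from FindComputeCpp.cmake, and the exact keyword arguments may vary between SDK versions.

```cmake
# Sketch: two SYCL source files in one executable target.
# File names (main.cpp, kernels.cpp) are illustrative.
cmake_minimum_required(VERSION 3.4.3)
project(my_sycl_program)
set(CMAKE_MODULE_PATH /path/to/computecpp-sdk/cmake/Modules/)
include(FindComputeCpp)
include_directories(${COMPUTECPP_INCLUDE_DIRECTORY})
add_executable(syclProgram
  ${CMAKE_CURRENT_SOURCE_DIR}/main.cpp
  ${CMAKE_CURRENT_SOURCE_DIR}/kernels.cpp)
# Every file that defines SYCL kernels must be listed in SOURCES, so that
# compute++ generates an integration header for each of them.
add_sycl_to_target(TARGET syclProgram
  SOURCES ${CMAKE_CURRENT_SOURCE_DIR}/main.cpp
          ${CMAKE_CURRENT_SOURCE_DIR}/kernels.cpp)
```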
Using Visual Studio
Visual Studio 2019 templates are provided as part of the ComputeCpp™ Community Edition installer for Windows. You can learn more about them in Getting Started with ComputeCpp for Windows®.
It is also possible to generate Visual Studio Project files using CMake. Refer to the CMake documentation for details on how to achieve this.
Offline Compilation (Only available in the Professional Edition)
The ComputeCpp Professional Edition package supports offline compilation of SYCL kernels directly to a target ISA via a custom tool, thereby avoiding the overhead of JIT compilation by the ComputeCpp runtime. To use this feature, you must invoke compute++ with the path to the tool that will perform the offline compilation and the arguments to forward to it.
There are four targets you can specify via -sycl-target, which tell compute++ which binary format the custom offline compilation tool will consume:
- custom-spir32
- custom-spir64
- custom-spirv32
- custom-spirv64
If any of the above targets is specified, you must also specify --sycl-custom-tool=/path/to/my/tool. If you have arguments that must be passed to this tool, you can forward them via -sycl-custom-args "my tool arguments".
For each of the above targets, the custom tool is expected to consume the corresponding binary format and output an executable program in an ISA that can be consumed by an OpenCL device.
When a custom target has been specified to compute++, the macro COMPUTECPP_OFFLINE_TARGET_CUSTOM will also be defined for use within SYCL kernel functions.
Below is an example invocation of compute++ to use offline compilation.
Offline compilation example
compute++ -sycl -sycl-target custom-spir64 --sycl-custom-tool=/path/to/my/tool \
  -sycl-custom-args "my tool arguments"
If one of the custom targets has been specified to compute++, the generated integration header will instruct the ComputeCpp runtime how to execute the program. However, the ComputeCpp runtime will not automatically know which OpenCL device to execute the program on, so you must define a device_selector to ensure the correct OpenCL device is chosen.
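A device_selector is an ordinary SYCL 1.2.1 class. The sketch below prefers a particular vendor's OpenCL device; the vendor string is a placeholder assumption, and the right match criterion depends on the device your offline target was built for. It requires the ComputeCpp headers to compile.

```cpp
#include <CL/sycl.hpp>
#include <string>

// Sketch: score devices so that the intended offline target wins.
// "My Vendor" is an illustrative placeholder, not a real vendor string.
class offline_target_selector : public cl::sycl::device_selector {
 public:
  int operator()(const cl::sycl::device& dev) const override {
    const auto vendor = dev.get_info<cl::sycl::info::device::vendor>();
    // Highest score selects the device; a negative score rejects it.
    return (vendor.find("My Vendor") != std::string::npos) ? 100 : -1;
  }
};

// Usage: pass the selector when constructing the queue, e.g.
//   cl::sycl::queue q{offline_target_selector{}};
```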
Offline compilation using -sycl-custom-command-line
Only available in the Professional Edition
An alternative to using -sycl-custom-args is to specify the entire command line for the offline compilation tool using -sycl-custom-command-line. The two flags cannot be used together.
-sycl-custom-command-line accepts several placeholders inside the argument string, which it replaces with compiler-generated values before invoking the provided tool:
- {input} : Path to the input file
- {output} : Path to the output file
- {output_dirname} : Path to the output folder
- {output_basename} : Base name of the output file
- {input_dirname} : Path to the input folder
- {input_basename} : Base name of the input file
- {pathsep} : Path separator for Windows and UNIX
Example using ocloc:
--sycl-custom-tool=/path/to/tool/ocloc
-sycl-custom-command-line "-out_dir {output_dirname} -output {output_basename} -output-no-suffix -file {input} -device skl -spirv_input"
-sycl-custom-output-file-suffix ".gen"
ocloc requires the output folder and the output file name to be defined separately. It also adds a suffix to the end of the output file name, which cannot be disabled.
Example using spirv-ll-tool:
--sycl-custom-tool=/path/to/tool/spirv-ll-tool
-sycl-custom-command-line "--api OpenCL -b auto -o {output} {input}"
Which is the same as:
--sycl-custom-tool=/path/to/tool/spirv-ll-tool
-sycl-custom-args "--api OpenCL -b auto"
Example using clc:
--sycl-custom-tool=/path/to/tool/clc
-sycl-custom-command-line "-d RISC-V -o {output} {input}"
or
--sycl-custom-tool=/path/to/tool/clc
-sycl-custom-args "-d RISC-V"
Executing Applications with ComputeCpp
The ComputeCpp package contains the ComputeCpp Runtime Library, which handles various aspects of the execution of SYCL applications, including (but not limited to) command group processing and scheduling, runtime kernel compilation, and memory management. The ComputeCpp Runtime Library interfaces between the user application and the OpenCL platform available in the system, if any.
ComputeCpp Runtime Execution Requirements
The ComputeCpp Runtime requires an OpenCL 1.2 implementation that features SPIR 1.2, SPIR-V, or PTX support. Only the OpenCL driver needs to be installed, not necessarily the vendor-provided SDK.
The ComputeCpp Runtime also requires a C++ runtime library with C++11 capabilities. This can be checked using the computecpp_info tool.
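To inspect a machine before deployment, the diagnostics tool can be run directly from the package (the install path below is a placeholder):

```shell
# Lists detected OpenCL platforms and checks runtime requirements.
/path/to/computecpp/bin/computecpp_info
```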
Refer to the Platform Support Notes and the Changelog that accompany the ComputeCpp Package for details on the supported platforms and known issues for that specific version of the package. Note that other platforms may work but have not been tested.
ComputeCpp Runtime Execution
The diagram below shows how the ComputeCpp Runtime works during execution. The ComputeCpp Runtime spawns a scheduling thread when the first SYCL object is created, and stops it when the last SYCL object is destroyed. An application using ComputeCpp may therefore start and stop a scheduling thread multiple times, for example when the number of live SYCL objects drops to zero and a new SYCL object is later created. The ComputeCpp SYCL host device implementation may spawn additional threads to execute a kernel without affecting the scheduling of other kernels. Note that OpenCL platforms may also spawn different numbers of threads depending on the implementation. Execution on the host device uses an internal thread pool that is constructed on the creation of the first SYCL object and destroyed when the last one is destroyed.
Figure 3. ComputeCpp Thread Diagram
If the ComputeCpp Runtime detects a SIGINT it will attempt to destroy the scheduling thread and the underlying OpenCL implementation safely, aborting any commands that are in progress.
If a SIGINT signal handler is specified using std::signal before the ComputeCpp scheduling thread is created, then the ComputeCpp Runtime will install a SIGINT signal handler which will call the user-specified signal handler on detection of a SIGINT.
Redistributing SYCL applications using ComputeCpp
When distributing SYCL applications that have been built using the ComputeCpp package, the ComputeCpp runtime library (libComputeCpp.so) must accompany the application binary. The device compiler, compute++, is not redistributable according to the license terms. The pre-compiled integration headers can be redistributed to enable compilation of a SYCL project in the absence of a device compiler. See the ComputeCpp License for legal information about redistributing applications that use the ComputeCpp Package.
The ComputeCpp runtime library must be present when the application is executed. Note that this means that the final application will have at least the same requirements and restrictions as the ComputeCpp library.
AMD is a registered trademark of Advanced Micro Devices, Inc. Intel is a trademark of Intel Corporation or its subsidiaries in the U.S. and/or other countries. NVIDIA and CUDA are registered trademarks of NVIDIA Corporation.
1. ComputeCpp CMake variables are all prefixed by COMPUTECPP in their names. ↩