Compute++ Experimental Compiler
ComputeCpp 2.7.0 introduced an optional new experimental device compiler. The experimental compiler is a significant rewrite of the current stable compiler with the aim of improving SYCL 2020 and C++20 support.
Note that this is considered an early preview and it may not generate correct code in all cases. The stable compiler is still supported, and support for many SYCL 2020 features will be added.
It is recommended that you use the existing stable compiler unless you need any features only available on the experimental compiler.
If you encounter any issues or have any feedback, please contact sycl@codeplay.com.
Motivation
The current stable version of the device compiler is based on an old version of LLVM (8.0.0). In addition, the algorithm used to deduce address spaces is problematic and not compatible with Unified Shared Memory (USM).
The experimental compiler is based on a new version of LLVM (an early beta version of 13.0.0), and has a new address space inference system.
Features and Limitations
Comparing the new experimental compiler to the existing compiler, there are a few notable changes:
Unified Shared Memory Support
The experimental compiler has full support for USM pointers, enabled when either -sycl-std is 2020, or -fsycl-enable-usm is set.
LLVM 13 Features
The new compiler is based on a newer version of LLVM. This means that it has the new features, optimizations and bugfixes introduced since LLVM 8 (as used in the stable device compiler). See LLVM's release notes for more details.
Template Instantiation With Address Spaces
Note that this change will only affect you if you want to use address space qualified pointers directly. Normal SYCL code shouldn't be affected by these changes.
In the stable compiler, template instantiation happens after address space deduction has been performed. In the experimental compiler, it happens before. To explain what this means, here is a (contrived) example.
// Generic version: used when no address space is part of the pointer type.
template <typename T>
void act(T *t) {
  t[0] = 1.0;
}

// Explicit specialization for pointers into the global address space.
template <>
void act(float __attribute__((opencl_global)) *t) {
  t[0] = 2.0;
}
[...]
// some_buff is pointing to a `float a`
auto acc = some_buff.get_access<cl::sycl::access::mode::read_write>(cgh);
cgh.single_task<class kernel>([=]() {
  // acc.get_pointer().get() is a `__global float *`
  float *as_pointer = acc.get_pointer().get();
  act(as_pointer);
});
[...]
std::cout << "A: " << a << "\n";
In the stable compiler, "A" will be printed as 2.0, since as_pointer will have been deduced to __global float *, and that specialization will be used.
In the experimental compiler, it will print 1.0, since the generic template is used before as_pointer is deduced to __global float *.
Note that the new behaviour in the experimental compiler is consistent with other SYCL implementations.
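The selection rule can be reproduced on the host with standard C++ alone. The sketch below is only an analogy, not ComputeCpp code: it assumes that const can stand in for an address space qualifier such as __global, and the names act_tag, call_with_unqualified and call_with_qualified are hypothetical.

```cpp
// Host-only analogy: `const` stands in for an address space qualifier such as
// __global. None of these names come from ComputeCpp.

// Generic template: chosen when T is deduced from an unqualified pointer.
template <typename T>
int act_tag(T *) { return 1; }

// Explicit specialization for T = const float, playing the role of the
// __global float * specialization in the example above.
template <>
int act_tag(const float *) { return 2; }

// The pointer is declared without the qualifier, so T = float is chosen when
// the template is instantiated, as in the experimental compiler.
int call_with_unqualified() {
  float value = 0.0f;
  float *p = &value;
  return act_tag(p);
}

// The qualifier is already part of the pointer type when the template is
// instantiated, which mirrors the stable compiler deducing __global first.
int call_with_qualified() {
  float value = 0.0f;
  const float *p = &value;
  return act_tag(p);
}
```

In this analogy, call_with_unqualified() returns 1 and call_with_qualified() returns 2, mirroring the 1.0 and 2.0 results described above.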
Version Detection
The experimental device compiler is available as a different ComputeCpp package. You can check which version you have available by running the following from the package:
bin/compute++ --version
The experimental compiler will emit the following output:
Codeplay ComputeCpp - PE 2.7.0 Device Compiler (Experimental) - clang version 13.0.0
The existing (stable) compiler will emit the following:
Codeplay ComputeCpp - PE 2.7.0 Device Compiler - clang version 8.0.0 (based on LLVM 8.0.0svn)
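If a build tool needs to tell the two packages apart, one possible approach is to look for the "(Experimental)" marker in the version line. The helper below is a minimal sketch under that assumption; is_experimental_compiler is a hypothetical name, and the check relies only on the two output lines shown above.

```cpp
#include <string>

// Hypothetical helper (not part of ComputeCpp): returns true when a line of
// `compute++ --version` output carries the "(Experimental)" marker.
bool is_experimental_compiler(const std::string &version_line) {
  return version_line.find("(Experimental)") != std::string::npos;
}
```

The caller is expected to capture the output of bin/compute++ --version itself. Future releases may word the banner differently, so the match is best treated as a heuristic.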
Usage
In terms of flags, the experimental compiler is almost a drop-in replacement for the existing compiler. The only thing that requires consideration is configuration of the SYCL version.
Version Selection
In the existing compiler, the SYCL version is selected by specifying a preprocessor define, as follows:
compute++ -DSYCL_LANGUAGE_VERSION=2020
The experimental compiler requires the version to be specified using -sycl-std= as follows:
compute++ -sycl-std=2020
This will define SYCL_LANGUAGE_VERSION as per the SYCL spec, so it may still be used as a feature detection macro. Note that with the experimental compiler, using -sycl-std=2020 may change features enabled by default (for example, USM is enabled by default when the version is 2020).
Please also note that if you are not using -fsycl or -sycl-driver and so have a dedicated host compilation step, you will have to define the macro SYCL_LANGUAGE_VERSION directly to use the ComputeCpp headers.
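Using SYCL_LANGUAGE_VERSION for feature detection might look like the sketch below. The names DEMO_SYCL_VERSION, targets_sycl_2020 and kBuildTargetsSycl2020 are hypothetical. The exact value the macro takes in each mode is not stated here, so the comparison only assumes that a 2020 build yields a value of at least 2020 and that a 2017 build yields something lower.

```cpp
// SYCL_LANGUAGE_VERSION is normally provided by the compiler or the build.
// Fall back to 0 so this sketch also compiles with a plain C++ compiler.
#ifdef SYCL_LANGUAGE_VERSION
#define DEMO_SYCL_VERSION (SYCL_LANGUAGE_VERSION)
#else
#define DEMO_SYCL_VERSION 0
#endif

// Hypothetical helper: true when the given version value selects SYCL 2020.
// Assumption: values for a 2020 build compare >= 2020.
constexpr bool targets_sycl_2020(long version) { return version >= 2020; }

// Evaluated at compile time for the current translation unit.
constexpr bool kBuildTargetsSycl2020 = targets_sycl_2020(DEMO_SYCL_VERSION);
```

Code that depends on SYCL 2020 features such as USM could then be guarded with the constant, or with an equivalent #if on the macro itself.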
New Flags
The following new flags have been added to the experimental compiler only:
-fsycl
Compile both host and device code (same as -sycl-driver). Note that this is NOT the same as -sycl, which only produces device code.
-fsycl-device-only
When used with -fsycl above, will only compile device code (the .sycl header). This is equivalent to -sycl.
-fsycl-host-only
When used with -fsycl above, will only compile host code. This can be used if compute++ is being used as the host compiler to use -sycl-std.
-fsycl-enable-usm/-fno-sycl-enable-usm
Enable/disable USM support for compiled device code. Note that if -sycl-std=2020 is set, then USM will be enabled by default. Otherwise, it will be disabled by default.
-sycl-std
Set the version of the SYCL standard to use. Either 2017 or 2020.
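The interaction between -sycl-std and the USM flags can be summarised as a small decision function. This is a hypothetical model of the defaults described above, not ComputeCpp source code. It assumes that an explicit -fsycl-enable-usm or -fno-sycl-enable-usm takes precedence over the default implied by the version; the names UsmFlag and usm_enabled are illustrative.

```cpp
// Hypothetical model of the documented defaults; not ComputeCpp source code.
// Unset means neither -fsycl-enable-usm nor -fno-sycl-enable-usm was passed.
enum class UsmFlag { Unset, Enable, Disable };

// sycl_std is the value passed to -sycl-std (2017 or 2020).
// Assumption: an explicit flag takes precedence over the version default.
constexpr bool usm_enabled(int sycl_std, UsmFlag flag) {
  return flag == UsmFlag::Unset ? sycl_std == 2020 : flag == UsmFlag::Enable;
}
```

Under these assumptions, usm_enabled(2020, UsmFlag::Unset) is true and usm_enabled(2017, UsmFlag::Unset) is false, matching the defaults stated for -sycl-std.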
New Warnings
A new warning has been added to the experimental compiler.
-Wsycl-type-inference
warning: [Computecpp:ASP005]: The type 'T1' was not fully solved and will be constrained to 'T2'
This warning is disabled by default, and must be enabled using -Wsycl-type-inference or -Wsycl-pedantic.
This warning is emitted when we are solving the type of a pointer (i.e. given a float *x, which address space (local, private, global, ...) is it in?). If we have no evidence on which address space it should be located in, it will be placed in the private address space. In most cases, this is correct and the warning can be ignored.