compute++(1)
Codeplay Software Ltd <sycl@codeplay.com>
NAME
compute++ - ComputeCpp SYCL compiler based on clang 6.0
SYNOPSIS
compute++ (-sycl | -sycl-device-only | -sycl-driver) [-sycl-target
DESCRIPTION
compute++ is the integrated SYCL driver and compiler for ComputeCpp.
When the -sycl or -sycl-device-only flag is used, compute++ takes in a C++ source file, compiles any SYCL kernels it finds (and any functions called from kernels) and outputs the ComputeCpp integration header. The integration header is a C++ header file, generated by compute++, which contains the compiled kernels in binary format, along with all the information the runtime requires to manage the kernels. Note that -sycl and -sycl-device-only are aliases; there is no difference between them.
When the -sycl-driver flag is used, compute++ acts as an integrated driver combining the device and host compilation steps. It takes in a C++ source file, compiles any SYCL kernels it finds (and any functions called from kernels) and outputs the ComputeCpp integration header as a temporary file. In a second compilation step, it compiles the combined C++ source file and integration header to produce an object file or executable.
The compiler is based on Clang/LLVM. It supports C++11 and partially supports C++14, with a view to supporting more recent C++ standards over time.
If none of -sycl, -sycl-device-only and -sycl-driver is used, compute++ acts as the normal clang compiler, so it can be used as the host compiler too. Refer to the clang/llvm documentation for general compilation information. Note that -sycl-host-only suppresses the effect of -sycl, -sycl-device-only and -sycl-driver when provided after them.
EXAMPLES
The basic invocation of compute++ to produce an integration header:
compute++ device compiler invocation
compute++ -sycl -c
This constitutes the bare minimum to produce the integration header from the SYCL kernels in a C++ source file.
The option -sycl enables the compute++ SYCL specific code paths to extract and compile its kernels to SPIR and emit the integration header. The -sycl option also makes the compiler assume that the input is a C++11 source file. The other option "-c" is a standard clang option to only perform the compilation step (no linking).
The basic invocation of compute++ to produce a host object file containing the device kernels:
compute++ integrated driver invocation
compute++ -sycl-driver -c
This constitutes the bare minimum to produce an object file containing the host code and the device kernels from a C++ source file.
The option -sycl-driver schedules calls to the compiler to first run with the -sycl flag to extract kernels and emit the integration header and then run without the -sycl flag and automatically include the integration header. The -sycl-driver option also makes the compiler assume that the input is a C++11 source file.
INTEGRATION HEADER
By using -sycl, compute++ will create an integration header. By convention, the integration header will have the extension ".sycl".
The following invocation:
compute++ -sycl -c myFile.cpp
will create the integration header as myFile.sycl in the same folder as myFile.cpp. You can have more control over the output file by using -o or -sycl-ih:
compute++ -sycl -c myFile.cpp -o integration.hpp.sycl
compute++ -sycl -c myFile.cpp -sycl-ih integration.hpp.sycl
compute++ will then create integration.hpp.sycl instead of myFile.sycl.
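As an illustration only, the naming rule described above can be modelled in a few lines of C++. The helper name integration_header_path is hypothetical and not part of compute++, and the precedence of -sycl-ih over -o when both are given is an assumption of this sketch rather than documented behaviour.

```cpp
#include <string>

// Hypothetical helper (not part of compute++): models where the integration
// header ends up for a device-only (-sycl) invocation.
// Assumed precedence: -sycl-ih, then -o, then the input file name with its
// extension replaced by ".sycl".
std::string integration_header_path(const std::string& input,
                                    const std::string& output_opt = "",
                                    const std::string& sycl_ih_opt = "") {
  if (!sycl_ih_opt.empty()) return sycl_ih_opt;
  if (!output_opt.empty()) return output_opt;

  const std::string::size_type slash = input.find_last_of("/\\");
  const std::string::size_type dot = input.find_last_of('.');
  // Only treat the dot as an extension separator if it is in the file name,
  // not in one of the parent directories.
  const bool has_ext = dot != std::string::npos &&
                       (slash == std::string::npos || dot > slash);
  return (has_ext ? input.substr(0, dot) : input) + ".sycl";
}
```

With this model, integration_header_path("myFile.cpp") yields myFile.sycl, matching the default described above.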
OPTIONS
*-no-serial-memop
Deactivate memcpy/memset intrinsics replacement by serial functions.
To work around some driver issues, memcpy and memset are replaced by serial functions, which can reduce performance. If the driver in use is not affected by these issues, this option prevents the replacement and can improve performance.
*-sycl-device-only
*-sycl
Enables the SYCL extension to C++.
This enables the SYCL kernel extraction, diagnostics and integration header production. The flags will also force the compiler to assume the input file is a C++11 file.
If the clang option -std is also used, its argument must be a C++ standard that covers C++11. If the language standard provided to -std is incompatible with SYCL, the language standard will be overridden to C++11.
*-sycl-driver
Enables the integrated SYCL driver.
This enables the compiler to first run in device mode (with the -sycl flag) to produce the integration header and then run in host mode with the integration header automatically included. The flag will also force the compiler to assume the input file is a C++11 file.
If the clang option -std is also used, its argument must be a C++ standard that covers C++11. If the language standard provided to -std is incompatible with SYCL, the language standard will be overridden to C++11.
*-sycl-host-only
Only perform the host compilation step. This is equivalent to omitting all of -sycl-device-only, -sycl and -sycl-driver.
Note: Only the last of -sycl-device-only, -sycl, -sycl-driver, -sycl-host-only is considered. The flag -sycl-host-only is useful to suppress the action of the other flags.
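The "last flag wins" rule in the note above can be sketched as follows. This is a model of the documented behaviour, not the driver's actual implementation; the names SyclMode and select_mode are hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the three modes described in this manual.
enum class SyclMode { HostOnly, DeviceOnly, Driver };

// Scans the arguments left to right; a later mode flag replaces an earlier
// one, so only the last of the four flags has an effect.
SyclMode select_mode(const std::vector<std::string>& args) {
  SyclMode mode = SyclMode::HostOnly;  // no mode flag: plain clang behaviour
  for (const std::string& arg : args) {
    if (arg == "-sycl" || arg == "-sycl-device-only") {
      mode = SyclMode::DeviceOnly;  // the two spellings are aliases
    } else if (arg == "-sycl-driver") {
      mode = SyclMode::Driver;
    } else if (arg == "-sycl-host-only") {
      mode = SyclMode::HostOnly;
    }
  }
  return mode;
}
```

For example, a build system that appends -sycl-host-only to an existing command line would, under this model, always obtain a host-only compilation.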
*-sycl-compress-name
Generate compressed SYCL kernel names.
This option will rename SYCL kernel functors to shorter names for use with OpenCL device drivers. You should use this option if you get an error from an OpenCL device driver when using heavily-templated kernels. There is a risk that using this option will create name collisions in kernels when using heavily templated code, so this option is not enabled by default. However, some OpenCL drivers, such as AMD®’s GPU drivers, will require this option when kernel names have a large number of template parameters.
*-sycl-keep-unused-args
Instructs the compiler to keep unused kernel arguments.
By default, all unused SYCL kernel arguments are removed to speed up kernel enqueue at runtime. This flag disables this behavior.
*-sycl-keep-attr=
Don’t allow the compiler to remove the specified attribute.
To strictly follow SPIR 1.2 specifications, some attributes used internally are removed before the module emission. To prevent the suppression, the attribute name to keep can be specified.
Currently supported value:
- noduplicate
*-sycl-target
Specify the target of the device compilation step.
Valid values are:
- spir: SPIR 32-bit target
- spir64: SPIR 64-bit target
- spirv: SPIR-V 32-bit target
- spirv64: SPIR-V 64-bit target
- ptx64: PTX 64-bit target (experimental)
- custom-spir32: Custom target using a custom offline compilation tool consuming 32-bit SPIR, needs "--sycl-custom-tool" to be specified (Professional Edition only)
- custom-spir64: Custom target using a custom offline compilation tool consuming 64-bit SPIR, needs "--sycl-custom-tool" to be specified (Professional Edition only)
- custom-spirv32: Custom target using a custom offline compilation tool consuming 32-bit SPIR-V, needs "--sycl-custom-tool" to be specified (Professional Edition only)
- custom-spirv64: Custom target using a custom offline compilation tool consuming 64-bit SPIR-V, needs "--sycl-custom-tool" to be specified (Professional Edition only)
If the option is omitted, the compiler will default to spir if the host target is a 32-bit architecture and spir64 if the host target is a 64-bit architecture.
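The target selection described above can be summarised in a short sketch. The function resolve_sycl_target is hypothetical; it only encodes the list of values and the default rule given in this manual, and returns an empty string where compute++ would report an invalid target.

```cpp
#include <string>

// Hypothetical helper: returns the device target that would be used for a
// given -sycl-target value ("" meaning the option was not given), or an
// empty string if the value is not in the documented list.
std::string resolve_sycl_target(const std::string& requested,
                                unsigned host_pointer_bits) {
  static const char* const kValid[] = {
      "spir",          "spir64",        "spirv",
      "spirv64",       "ptx64",         "custom-spir32",
      "custom-spir64", "custom-spirv32", "custom-spirv64"};

  if (requested.empty()) {
    // Default follows the host architecture width.
    return host_pointer_bits == 32 ? "spir" : "spir64";
  }
  for (const char* name : kValid) {
    if (requested == name) return requested;
  }
  return std::string();  // not a documented target
}
```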
Note:
- Option -sycl-spir32 should now be replaced by -sycl-target spir
- Option -sycl-spir64 should now be replaced by -sycl-target spir64
*--sycl-custom-tool=<path>
Path to a custom tool to use as the offline compilation back-end. Needs to be specified when using one of the "custom" device compilation targets for offline compilation.
*-sycl-custom-args <value>
Arguments to pass to the custom tool being used as the offline compilation back-end. Arguments provided in quotes will be unquoted and passed on as separate arguments ("a b" will be passed as two arguments while " 'a b' " will be passed as one argument).
Can be specified when using one of the "custom" device compilation targets for offline compilation.
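The quoting behaviour described for -sycl-custom-args can be pictured with a small splitter. This is a sketch under assumptions: the function split_custom_args is hypothetical, it treats only single quotes as grouping characters, and the exact parsing rules used by compute++ may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical model: splits a -sycl-custom-args value on whitespace, except
// inside single quotes, which group text into one argument and are dropped.
std::vector<std::string> split_custom_args(const std::string& value) {
  std::vector<std::string> args;
  std::string current;
  bool in_quotes = false;  // currently inside '...'
  bool has_token = false;  // current holds a token (possibly empty, e.g. '')

  for (char c : value) {
    if (c == '\'') {
      in_quotes = !in_quotes;
      has_token = true;
    } else if ((c == ' ' || c == '\t') && !in_quotes) {
      if (has_token) {
        args.push_back(current);
        current.clear();
        has_token = false;
      }
    } else {
      current += c;
      has_token = true;
    }
  }
  if (has_token) args.push_back(current);
  return args;
}
```

Under this model the value "a b" produces two arguments, while " 'a b' " produces the single argument a b, as in the description above.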
*-sycl-ih <path>
Specify a path to output the integration header.
If no value is specified, the name will be inferred using the output file if specified using -o or the input file.
Note that this option cannot be used if multiple input files are processed.
*-fsycl-ih-last
Force the driver to include the integration header after the main source file. Note that this flag has an effect only when used with -sycl-driver.
*-fsycl-split-modules=
This option lets the compiler generate multiple modules and puts a maximum of
This may improve compilation times when only a small number of kernels are used at run-time and not every module has to be compiled. By default only one module for all kernels will be produced.
*-sycl-ocl-target-builtins-path
Specify a custom path to an OpenCL builtin implementation for the target
By default this flag will be automatically set by the Clang driver when targeting PTX. If you provide your own path it will overwrite the default. The default path is to a binary file containing the implementation of several OpenCL builtins. This option is only used for PTX targets. It will be ignored for any other target.
*-sycl-print-stats
Print performance statistics for each SYCL kernel extracted by different analyses (Professional Edition only)
Old Options
The following options were removed in a previous release and are no longer recognized.
*--sycl-no-diags
This option had no effect for several releases.
*-sycl-spir32
This option used to force the compiler to output 32-bit SPIR or SPIR-V bitcode. The target is now specified using the -sycl-target option.
*-sycl-spir64
This option used to force the compiler to output 64-bit SPIR or SPIR-V bitcode. The target is now specified using the -sycl-target option.
Clang standard options
As the compiler is based on clang 6, some of the standard clang 6 options are available. This includes:
- Warning and error formatting flags
- Optimisation flags
For more information about clang usage, refer to the clang 6.0 manual.
Compute++-specific Diagnostics
This section describes SYCL-specific diagnostics. Compute++ errors, warnings and remarks take the form:
[Computecpp:CCxxxx] Error message
where "xxxx" is the numeric code of the diagnostic.
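Because every diagnostic carries a code in this form, build tooling can pick the code out of the compiler output. The following is a minimal sketch; the function computecpp_diag_code is hypothetical and simply looks for the "[Computecpp:CC" prefix followed by digits and a closing bracket.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper for log processing: returns the numeric diagnostic
// code found in a line of compiler output, or -1 if the line does not
// contain a tag of the form [Computecpp:CCxxxx].
int computecpp_diag_code(const std::string& line) {
  const std::string tag = "[Computecpp:CC";
  const std::string::size_type start = line.find(tag);
  if (start == std::string::npos) return -1;

  std::string::size_type pos = start + tag.size();
  int code = 0;
  int digits = 0;
  while (pos < line.size() &&
         std::isdigit(static_cast<unsigned char>(line[pos]))) {
    code = code * 10 + (line[pos] - '0');
    ++pos;
    ++digits;
  }
  // Require at least one digit and the closing bracket.
  if (digits == 0 || pos >= line.size() || line[pos] != ']') return -1;
  return code;
}
```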
[Computecpp:CC0001]
a SYCL kernel must be named with a class, struct, enum or union type
[Computecpp:CC0002]
cannot name a SYCL kernel with
[Computecpp:CC0003]
an unnamed type is an invalid template argument for classes used to name SYCL kernels.
Valid SYCL kernel names must be a forward declarable class, struct, enum or union type.
The following examples are valid SYCL kernel names:
class ClassKernelName;
struct MyStruct;
enum class MyEnum : int;
The following types are invalid SYCL kernel names:
unsigned // not a class, struct, enum or union type
enum MyCpp98Enum {/* Definition */}; // not forward declarable
[Computecpp:CC0004]
a SYCL kernel with the name
SYCL kernel names must be unique, so the same type cannot be used to name two kernels. In such cases, compute++ will emit a note giving the location of the original name.
class MyName;
[...]
h.parallel_for<class MyName>(...);
h.parallel_for<class MyName>(...); // CC0004: the name is used twice
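One way to obtain distinct names without declaring a new class for every kernel is to make the name a forward-declared class template and vary its argument. The sketch below uses only standard C++ to show that the resulting types are distinct; the helper distinct_names is hypothetical, and whether a given template argument is accepted as part of a kernel name is subject to the naming rules described in this section.

```cpp
#include <type_traits>

// Forward-declared name template: each value of N yields a different type,
// so MyName<0> and MyName<1> can name two different kernels.
template <int N>
class MyName;

// Hypothetical helper: true when two candidate kernel names are different
// types and therefore would not collide.
template <typename A, typename B>
constexpr bool distinct_names() {
  return !std::is_same<A, B>::value;
}

static_assert(distinct_names<MyName<0>, MyName<1>>(),
              "different template arguments give different names");
static_assert(!distinct_names<MyName<0>, MyName<0>>(),
              "re-using the same argument re-uses the same name");
```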
[Computecpp:CC0005]
cannot name a SYCL kernel with nested class, struct, enum or union `
SYCL kernel names cannot be a nested class as it is not forward declarable.
struct MyClass {
struct Nested {};
};
h.parallel_for<MyClass::Nested>(...); // CC0005
[Computecpp:CC0006]
C++ constructs that cannot be converted into valid OpenCL code are invalid in SYCL. These include:
- Exceptions (try / catch expressions)
- Accessing non-const global variables inside device code
- Function pointers
- RTTI (dynamic_cast, typeid)
- Dynamic memory handling: new, delete
  - Note: placement new is allowed
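Code that relies on these constructs can often be restructured. The sketch below shows two common rewrites in plain C++: an enum and a switch in place of a function pointer, and a status field in place of an exception. All names (Op, apply, DivResult, checked_divide) are hypothetical, and the sketch does not claim that these are the only or preferred alternatives.

```cpp
// Selects the operation with an enum instead of a function pointer.
enum class Op { Add, Mul };

inline int apply(Op op, int a, int b) {
  switch (op) {
    case Op::Add: return a + b;
    case Op::Mul: return a * b;
  }
  return 0;  // unreachable for valid Op values
}

// Reports failure through the return value instead of throwing.
struct DivResult {
  bool ok;    // false when the division could not be performed
  int value;  // quotient, meaningful only when ok is true
};

inline DivResult checked_divide(int a, int b) {
  if (b == 0) return DivResult{false, 0};
  return DivResult{true, a / b};
}
```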
[Computecpp:CC0007]
cannot capture
[Computecpp:CC0008]
a variable of type `
A SYCL kernel cannot capture:
- By reference
- The "this" pointer
- Raw pointers
For instance:
class example;
[...]
int *ptr = new int(10); // Initialise a pointer
cgh.single_task<class example>([=]() {
foo(ptr); // Illegal to use that pointer within kernel
});
The above code would give the error:
error: a variable of type 'int *' cannot be captured by a SYCL kernel, because it is a pointer type
[Computecpp:CC0009]
cannot capture object instance of class
[Computecpp:CC0010]
cannot capture object instance of class
[Computecpp:CC0011]
cannot capture object
[Computecpp:CC0020]
SYCL kernel parameter `' of type `
A SYCL kernel cannot capture:
- Any objects containing raw pointers
- Any objects that do not have a C++ standard layout, including:
  - Objects with private AND public fields (either private or public is OK but not both)
  - Objects containing a non-static field of reference type
- Any objects that contain virtual methods
For instance:
class example;
struct FooClass
{
FooClass(int i): m_pMember(new int(i)) {}
int *m_pMember;
};
FooClass instance {1}; // Initialize the instance
cgh.single_task<class example>([=]() {
foo(instance.m_pMember); // Illegal to use that pointer within kernel
});
The above code would give the error:
error: can not capture object instance of type 'struct FooClass' in a SYCL kernel, because it contains a field with a pointer type
[Computecpp:CC0012]
class
[Computecpp:CC0013]
`
One or more SYCL kernel parameter restrictions have been violated, which triggered this error.
This includes the following:
- "it is a non standard-layout type": An object passed as a kernel parameter must be standard-layout.
- "it contains virtual methods": Objects of a class that contains virtual methods are not supported, as they are not standard-layout.
- "it contains a field with a pointer type": Host raw pointers cannot be used on the device with OpenCL 1.2. To pass a pointer to the device, SYCL buffers and accessors must be used instead.
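The standard-layout requirement can be checked on the host with the standard type trait std::is_standard_layout before a type is ever captured by a kernel. The sketch below is illustrative: the types and the helper is_kernel_param_layout_ok are hypothetical, and the trait covers only the layout rule. A type with a pointer field, such as WithPointer, is standard-layout and would still be rejected by compute++.

```cpp
#include <type_traits>

struct PlainData { int a; float b; };  // standard-layout

class MixedAccess {  // public and private fields: not standard-layout
 public:
  int a;
 private:
  int b;
};

struct WithVirtual {  // virtual method: not standard-layout
  virtual void f() {}
  int a;
};

struct WithReference { int& r; };  // reference field: not standard-layout

struct WithPointer { int* p; };  // standard-layout, but has a pointer field

// Hypothetical helper: checks only the layout rule, not the pointer rule.
template <typename T>
constexpr bool is_kernel_param_layout_ok() {
  return std::is_standard_layout<T>::value;
}

static_assert(is_kernel_param_layout_ok<PlainData>(), "plain struct is fine");
static_assert(!is_kernel_param_layout_ok<MixedAccess>(), "mixed access");
static_assert(!is_kernel_param_layout_ok<WithVirtual>(), "virtual method");
static_assert(!is_kernel_param_layout_ok<WithReference>(), "reference field");
```

Placing such a static_assert next to the definition of a type intended for kernel capture reports a layout problem in host-only builds as well.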
[Computecpp:CC0014]
SYCL kernel must return void
SYCL kernel functors or lambdas cannot return a value.
[Computecpp:CC0015]
recursion is not allowed in SYCL functions
[Computecpp:CC0016]
recursion is not allowed in SYCL functions, but a call graph cycle was detected starting in function `
SYCL disallows recursive functions because OpenCL disallows them.
CC0015 is reported when a function calls itself directly. CC0016 is reported for indirect recursion, where a chain of calls eventually leads back to the starting function.
int recurse(int i) { // CC0015
return i < 0 ? 0 : recurse(i - 1);
}
int funcA(int);
int funcB(int i) {
return funcA(i - 1) + i;
}
int funcA(int i) { // CC0016
return i < 0 ? 0 : funcB(i - 1);
}
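Recursive functions such as these usually have to be rewritten as loops before they can be called from a kernel. As a sketch, substituting funcB into funcA above gives funcA(i) = funcA(i - 2) + (i - 1) for i >= 0, and 0 otherwise, which can be computed iteratively. The name funcA_iterative is hypothetical; the recursive pair is repeated here only as a host-side reference for comparison.

```cpp
// Host-only reference: the mutually recursive pair from the example above.
int funcA(int i);
int funcB(int i) { return funcA(i - 1) + i; }
int funcA(int i) { return i < 0 ? 0 : funcB(i - 1); }

// Loop-based equivalent of funcA with no call graph cycle.
// Each iteration adds the (i - 1) term contributed by funcB and then
// steps i down by 2, mirroring funcA -> funcB -> funcA.
int funcA_iterative(int i) {
  int total = 0;
  for (; i >= 0; i -= 2) {
    total += i - 1;
  }
  return total;
}
```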
[Computecpp:CC0023]
class
This error will accompany [Computecpp:CC0011] cannot capture object
- Any objects containing raw pointers
- Any objects that do not have a standard layout, including:
  - Objects with private AND public fields (either private or public is OK but not both)
  - Objects containing a non-static field of reference type
- Any objects that contain virtual methods
[Computecpp:CC0027]
Some memcpy/memset intrinsics added by the llvm optimizer were replaced by serial functions. This is a workaround for OpenCL drivers that do not support those intrinsics. This may impact performance, consider using -no-serial-memop.
Some OpenCL drivers do not support certain LLVM intrinsic functions, so compute++ replaces them with its own code, which works on all drivers. However, this could prevent a vendor compiler that does support these intrinsics from performing certain optimisations, so the option -no-serial-memop is available to instruct compute++ not to replace them. Generally this remark can be ignored as the code is still valid, but there may be a performance impact.
[Computecpp:CC0028]
OpenCL requires the format string of printf to be a constant string.
This warning is emitted if you pass anything other than a string literal as the format string of the printf function. Code like the following should produce this warning.
const char * str = "somestring";
printf(str);
The preferred method of printing from a SYCL kernel is to pass a stream object into the kernel.
class example;
[...]
myQueue.submit([&](handler& handle) {
  stream os(1024, 80, handle);
  handle.single_task<class example>([=]() {
    os << "somestring" << endl;
  });
});
[Computecpp:CC0029]
The `noduplicate' attribute was removed from some functions in order to guarantee conformance with SPIR 1.2. To keep that attribute the -sycl-keep-attr=
This remark is emitted whenever the compiler removes the attribute "noduplicate" from a function or function call. It is not a warning or an error; it is a remark, as this is expected behavior from the compiler and the code remains valid.
[Computecpp:CC0030]
sycl requires at least C++11, force language standard to C++11
The SYCL specification is defined to support C++11 features. Therefore, a SYCL program requires a language standard that includes the C++11 feature set.
The language standard provided to the "-std" command-line option is incompatible with SYCL and the language standard has been forced to C++11.
[Computecpp:CC0031]
`
The value
[Computecpp:CC0032]
Variadic functions cannot be used in SYCL device code
SYCL device functions cannot call C-style variadic functions.
Calls like below are not allowed on device.
class example;
void variadic(int, ...);
[...]
myQueue.submit([&](handler& handle) {
  handle.single_task<class example>([=]() {
    variadic(2, 42., "str");
  });
});
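Where a variable number of arguments is genuinely needed, a variadic function template is one possible replacement, since each call is resolved at compile time into an ordinary function with a fixed parameter list. The following is a sketch in C++11; the function sum_all is hypothetical and is not taken from the ComputeCpp documentation.

```cpp
// Type-safe replacement for a C-style variadic function: sums its
// arguments after converting each one to int.
template <typename... Ts>
int sum_all(Ts... values) {
  int total = 0;
  // Pack expansion inside a braced initializer list (valid in C++11).
  // Elements are evaluated left to right; the leading 0 keeps the array
  // non-empty when no arguments are given.
  const int expand[] = {0, (total += static_cast<int>(values), 0)...};
  (void)expand;  // only the side effects are needed
  return total;
}
```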
[Computecpp:CC0033] (deprecated)
--sycl-no-diags is deprecated and has no effect.
The flag --sycl-no-diags is now deprecated and the use of this flag does not have any effect any more.
[Computecpp:CC0034]
Function
You will see this warning if an undefined function which does not match one of the known OpenCL builtin functions has been referenced in device code. If the function cannot be resolved by the OpenCL implementation, the kernel is unlikely to compile.
[Computecpp:CC0035]
Intrinsic
The SPIR spec only allows for memset and memcpy intrinsics to be used, anything else is technically illegal and could cause issues with the OpenCL implementation. This warning will tell you if an illegal intrinsic has been generated.
[Computecpp:CC0036]
OpenCL extension cl_khr_fp16 should be enabled before using type half
This warning will appear if cl::sycl::half (alias to __fp16) is used in code before the OpenCL extension cl_khr_fp16 is enabled. Neither the presence nor the absence of this warning says anything about whether the underlying OpenCL implementation supports half. You must check this manually using the "has_extension" method of the runtime device class.
[Computecpp:CC0037]
Variable length arrays are not supported in SYCL kernels.
Variable length arrays (VLA) are disallowed by OpenCL v1.2 s6.9.d, thus cannot be used in a SYCL kernel.
Note: This is a C99 feature, but some compilers provide extensions to support it in C++.
void foo(unsigned n) {
int vla[n]; // illegal if called from a kernel
}
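A common alternative is a fixed-size array whose bound is known at compile time, with the run-time count clamped to that bound. The sketch below is plain C++; the bound kMaxElems and the function sum_of_squares are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>

// Hypothetical compile-time upper bound replacing the run-time array size.
constexpr std::size_t kMaxElems = 16;

// Fills a fixed-size buffer with the first n squares (at most kMaxElems of
// them) and returns their sum.
int sum_of_squares(unsigned n) {
  int buffer[kMaxElems];  // size known at compile time, no VLA
  const std::size_t count = n < kMaxElems ? n : kMaxElems;  // clamp to bound

  for (std::size_t i = 0; i < count; ++i) {
    buffer[i] = static_cast<int>(i * i);
  }
  int total = 0;
  for (std::size_t i = 0; i < count; ++i) {
    total += buffer[i];
  }
  return total;
}
```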
[Computecpp:CC0039]
Kernel
The OpenCL attributes vec_type_hint, work_group_size_hint and reqd_work_group_size can be used in SYCL. In SYCL, these attributes are applied to functions (rather than to kernel functions as in OpenCL) and the compiler infers from the function calls which OpenCL attributes should be applied to the kernel. If a function A and a function B called from the same kernel both use the same attribute, they must use the same values. If the values disagree, the compiler reports an error pointing out the affected kernel and the functions causing the issue.
class Ker;
__attribute__((reqd_work_group_size(64, 64, 1)))
void A() {}
__attribute__((reqd_work_group_size(64, 1, 1)))
void B() {}
[...]
myQueue.submit([&](handler& handle) {
  handle.single_task<class Ker>([=]() {
    A();
    B();
  });
});
The example above will generate the following diagnostic:
error: [Computecpp:CC0039]: Kernel Ker: conflicting values for attribute reqd_work_group_size
note: [Computecpp:CC0039]: Attribute reqd_work_group_size values inherited from A
note: [Computecpp:CC0039]: Registered values 'i32 64, i32 64, i32 1'
note: [Computecpp:CC0039]: Attribute reqd_work_group_size values inherited from B
note: [Computecpp:CC0039]: Registered values 'i32 64, i32 1, i32 1'
[Computecpp:CC0040]
Accessor appears as member of union used as a kernel argument - associated kernel may not run correctly
Using unions that contain accessor fields as kernel arguments is discouraged and results in undefined behavior.
class Ker;
[...]
myQueue.submit([&](handler& handle) {
  union {
    accessor<int, 1, access::mode::read_write, access::target::global_buffer> Acc;
    int IntField;
  } Arg;
  handle.single_task<class Ker>([=]() {
    Arg.IntField; // CC0040
  });
});
[Computecpp:CC0042]
Invalid SYCL target
The value given to -sycl-target is not a valid device target.
[Computecpp:CC0043]
Community Edition releases do not support multiple device targets
Community edition releases are restricted to one device target per kernel.
[Computecpp:CC0047]
'-sycl-spir32' is deprecated, use '-sycl-target spir' instead
'-sycl-spir64' is deprecated, use '-sycl-target spir64' instead
Options -sycl-spir32 and -sycl-spir64 are deprecated and may be removed in future releases. They are now aliases for "-sycl-target spir" and "-sycl-target spir64" respectively, which should be used directly instead.
[Computecpp:CC0048]
Kernel name
According to the SYCL specification, kernel name types must be accessible from the global namespace. If a type is declared locally (e.g. in a function), it cannot be accessed from outside that function.
myQueue.submit([&](handler& handle)
cgh.single_task
A possible solution is to forward declare the type to make it non-local:
class Ker;
[...]
myQueue.submit([&](handler& handle)
cgh.single_task
[Computecpp:CC0049]
Processor
The processor specified with -msyclarch is not usable for the given target.
[Computecpp:CC0052]
During compilation of the current function for the target device, a write to constant memory was detected.
void f(int &i) { i = 0; } // CC0052 triggered by assignment to int in constant memory
[ ... ]
myQueue.submit([&](handler &handle) {
auto acc = buffer.get_access<access::mode::write, access::target::constant_buffer>(handle);
handle.single_task<class Ker>(([=]() { f(acc[0]); }));
});
[Computecpp:CC0056]
Custom target specified but the tool to use not provided. Provide the tool using --sycl-custom-tool.
A SYCL target for offline compilation using a custom tool was specified, but the tool to use was not provided.
[Computecpp:CC0057]
Community Edition releases do not support offline compilation
A SYCL target for offline compilation was specified but Community Edition releases do not support this feature.
[Computecpp:CC0058]
Split modules (-fsycl-split-modules) is incompatible with some features, such as offline compilation.
[Computecpp:CC0059]
Community Edition releases do not support printing performance statistics for SYCL kernels using -sycl-print-stats
Compute++ Internal Diagnostics
The following diagnostics are internal. If you encounter one of them and the ComputeCpp package is correctly installed, please file a bug report.
[Computecpp:CC0017]
ignored invalid use of SYCL kernel attribute
[Computecpp:CC0018]
template specialization of a SYCL kernel is not a SYCL kernel without the proper attributes
[Computecpp:CC0019]
`%0' is invalid for a SYCL kernel and was ignored
[Computecpp:CC0021]
SYCL kernels can only have one functor parameter: parameter `' of type `
[Computecpp:CC0022] (deprecated)
a SYCL kernel cannot also be an OpenCL kernel
[Computecpp:CC0024]
Missing serial function equivalent for intrinsic
[Computecpp:CC0025]
Serial function
[Computecpp:CC0026]
Serial function
[Computecpp:CC0038]
Invalid SYCL attribute for function
Some internal metadata are not correctly formed.
[Computecpp:CC0041]
Redefinition of binary metadata for target
[Computecpp:CC0044]
Redefinition of kernel
[Computecpp:CC0045]
Error reading SYCL metadata file:
[Computecpp:CC0046]
Expect the same number of binary information placeholders (got ) and kernel input files (got
RESOURCES
AUTHOR
Codeplay Software Ltd <sycl@codeplay.com>
AMD is a registered trademark of Advanced Micro Devices, Inc. Intel is a trademark of Intel Corporation or its subsidiaries in the U.S. and/or other countries. NVIDIA and CUDA are registered trademarks of NVIDIA Corporation