Compute++ Compiler

This guide was created for versions: v0.8.0 - Latest

NAME

compute++ - ComputeCpp SYCL compiler based on clang 6.0

SYNOPSIS

compute++ (-sycl | -sycl-device-only | -sycl-driver) [-sycl-target <target>] [options] <input file>

DESCRIPTION

compute++ is the integrated SYCL driver and compiler for ComputeCpp.

When the -sycl or -sycl-device-only flag is used, compute++ takes in a C++ source file, compiles any SYCL kernels it finds (and any functions called from those kernels) and outputs the ComputeCpp integration header. The integration header is a C++ header file, generated by compute++, which contains the compiled kernels in binary format, along with all the information the runtime requires to manage the kernels. Note that -sycl and -sycl-device-only are aliases; there is no difference between them.

When the -sycl-driver flag is used, compute++ acts as an integrated driver combining the device and host compilation steps. It takes in a C++ source file, compiles any SYCL kernels it finds (and any functions called from those kernels) and outputs the ComputeCpp integration header as a temporary file. In a second compilation step, it compiles the C++ source file together with the integration header to produce an object file or executable.

The compiler is based on Clang/LLVM and supports C++11 and, partially, C++14, with a view to supporting more C++ standards over time.

If none of -sycl, -sycl-device-only and -sycl-driver is used, compute++ acts as the normal clang compiler, so it can be used as the host compiler too. Refer to the clang/llvm documentation for general compilation information. Note that -sycl-host-only suppresses the effect of -sycl, -sycl-device-only and -sycl-driver when provided after them.

EXAMPLES

The basic invocation of compute++ to produce an integration header:

compute++ device compiler invocation

compute++ -sycl -c

This constitutes the bare minimum to produce the integration header from the SYCL kernels in a C++ source file.

The option -sycl enables the compute++ SYCL specific code paths to extract and compile its kernels to SPIR and emit the integration header. The -sycl option also makes the compiler assume that the input is a C++11 source file. The other option "-c" is a standard clang option to only perform the compilation step (no linking).

The basic invocation of compute++ to produce a host object file containing the device kernels:

compute++ integrated driver invocation

compute++ -sycl-driver -c

This constitutes the bare minimum to produce an object file containing both the host code and the SYCL device kernels from a C++ source file.

The option -sycl-driver schedules calls to the compiler to first run with the -sycl flag to extract kernels and emit the integration header and then run without the -sycl flag and automatically include the integration header. The -sycl-driver option also makes the compiler assume that the input is a C++11 source file.

INTEGRATION HEADER

By using -sycl, compute++ will create an integration header. By convention, the integration header will have the extension ".sycl".

The following invocation:

compute++ -sycl -c myFile.cpp

will create the integration header as myFile.sycl in the same folder as myFile.cpp. You can have more control over the output file by using -o or -sycl-ih:

compute++ -sycl -c myFile.cpp -o integration.hpp.sycl
compute++ -sycl -c myFile.cpp -sycl-ih integration.hpp.sycl

compute++ will then create integration.hpp.sycl instead of myFile.sycl.
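
The default naming convention described above can be modelled in a few lines of C++. This is only a sketch of the documented rule (the source extension is replaced with ".sycl"); default_integration_header is a hypothetical helper, not part of compute++, and it deliberately ignores the -o and -sycl-ih overrides.

```cpp
#include <string>

// Hypothetical helper: models the documented convention that
// myFile.cpp yields myFile.sycl in the same folder.
std::string default_integration_header(const std::string& source) {
  const std::string::size_type slash = source.find_last_of("/\\");
  const std::string::size_type dot = source.find_last_of('.');
  // Only treat the dot as an extension separator if it is in the file name.
  const bool has_ext = dot != std::string::npos &&
                       (slash == std::string::npos || dot > slash);
  const std::string stem = has_ext ? source.substr(0, dot) : source;
  return stem + ".sycl";
}
```

A build script could use such a helper to predict which file to add to its dependency list when neither -o nor -sycl-ih is given.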

OPTIONS

*-no-serial-memop

Deactivate memcpy/memset intrinsics replacement by serial functions.

To work around some driver issues, memcpy and memset are replaced by serial functions, which can reduce performance. If the driver in use is not affected by these issues, this option prevents the replacement and can have a positive impact on performance.

*-sycl-device-only
*-sycl

Enables the SYCL extension to C++.

This enables the SYCL kernel extraction, diagnostics and integration header production. The flags will also force the compiler to assume the input file is a C++11 file.

If the clang option -std is also used, its argument must be a C++ standard that covers C++11. If the language standard provided to -std is incompatible with SYCL, the language standard will be overridden to C++11.

*-sycl-driver

Enables the integrated SYCL driver.

This enables the compiler to first run in device mode (with the -sycl flag) to produce the integration header and then run in host mode with the integration header automatically included. This flag will also force the compiler to assume the input file is a C++11 file.

If the clang option -std is also used, its argument must be a C++ standard that covers C++11. If the language standard provided to -std is incompatible with SYCL, the language standard will be overridden to C++11.

*-sycl-host-only

Only perform the host compilation step. This is equivalent to omitting all of -sycl-device-only, -sycl and -sycl-driver.

Note: Only the last of -sycl-device-only, -sycl, -sycl-driver, -sycl-host-only is considered. The flag -sycl-host-only is useful to suppress the action of the other flags.
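
The "last flag wins" rule in the note above can be illustrated with a small model. This is a sketch of the documented behavior only, not compute++ source code; the Mode enum and resolve_mode function are hypothetical names introduced for illustration.

```cpp
#include <string>
#include <vector>

enum class Mode { HostOnly, DeviceOnly, Driver };

// Models the documented rule: only the last mode flag is considered.
// With no mode flag at all, the result is plain host compilation.
Mode resolve_mode(const std::vector<std::string>& args) {
  Mode mode = Mode::HostOnly;
  for (const std::string& arg : args) {
    if (arg == "-sycl" || arg == "-sycl-device-only") {
      mode = Mode::DeviceOnly;  // the two flags are aliases
    } else if (arg == "-sycl-driver") {
      mode = Mode::Driver;
    } else if (arg == "-sycl-host-only") {
      mode = Mode::HostOnly;
    }
  }
  return mode;
}
```

Under this model, appending -sycl-host-only to a command line that already contains -sycl yields host-only compilation, which is the suppression use case the note describes.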

*-sycl-compress-name

Generate compressed SYCL kernel names.

This option will rename SYCL kernel functors to shorter names for use with OpenCL device drivers. You should use this option if you get an error from an OpenCL device driver when using heavily-templated kernels. There is a risk that using this option will create name collisions in kernels when using heavily templated code, so this option is not enabled by default. However, some OpenCL drivers, such as AMD®’s GPU drivers, will require this option when kernel names have a large number of template parameters.

*-sycl-keep-unused-args

Instructs the compiler to keep unused kernel arguments.

By default, all unused SYCL kernel arguments are removed to speed up kernel enqueue at runtime. This flag disables this behavior.

*-sycl-keep-attr=<value>

Don’t allow the compiler to remove the specified attribute.

To strictly follow SPIR 1.2 specifications, some attributes used internally are removed before the module emission. To prevent the suppression, the attribute name to keep can be specified.

Current value supported:

  • noduplicate

*-sycl-target <target>

Specify the target of the device compilation step.

Valid values are:

  • spir: SPIR 32-bit target

  • spir64: SPIR 64-bit target

  • spirv: SPIR-V 32-bit target

  • spirv64: SPIR-V 64-bit target

  • ptx64: PTX 64-bit target (experimental)

If the option is omitted, the compiler will default to spir if the host target is a 32-bit architecture and spir64 if the host target is a 64-bit architecture.
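
The default target selection described above amounts to a single test on the host pointer width. The following sketch models that rule; default_sycl_target and default_sycl_target_for_this_host are hypothetical helpers, and using the pointer width of the machine compiling the sketch as the "host target" is an assumption made for illustration.

```cpp
#include <climits>
#include <string>

// Hypothetical helper modelling the documented default:
// "spir" on 32-bit hosts, "spir64" on 64-bit hosts.
std::string default_sycl_target(unsigned host_pointer_bits) {
  return host_pointer_bits == 64 ? "spir64" : "spir";
}

// Assumption: the host width is taken from the pointer size of the
// machine that compiles this sketch.
std::string default_sycl_target_for_this_host() {
  return default_sycl_target(
      static_cast<unsigned>(sizeof(void*) * CHAR_BIT));
}
```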

Note:

  • Option -sycl-spir32 should now be replaced by -sycl-target spir

  • Option -sycl-spir64 should now be replaced by -sycl-target spir64

*-sycl-ih <path>

Specify a path to output the integration header.

If no value is specified, the name will be inferred using the output file if specified using -o or the input file.

Note that this option cannot be used if multiple input files are processed.

*-fsycl-ih-last

Force the driver to include the integration header after the main source file. Note that this flag has an effect only when used with -sycl-driver.

*-sycl-ocl-target-builtins-path <path>

Specify a custom path to an OpenCL builtin implementation for the target.

By default this flag will be automatically set by the Clang driver when targeting PTX. If you provide your own path it will overwrite the default. The default path is to a binary file containing the implementation of several OpenCL builtins. This option is only used for PTX targets. It will be ignored for any other target.

Old Options

The following options were removed in a previous release and are no longer recognized:

*--sycl-no-diags

This option had no effect for several releases.

*-sycl-spir32

This option used to force the compiler to output 32-bit SPIR or SPIR-V bitcode. The target is now specified using the -sycl-target <target> option.

*-sycl-spir64

This option used to force the compiler to output 64-bit SPIR or SPIR-V bitcode. The target is now specified using the -sycl-target <target> option.

Clang standard options

As the compiler is based on clang 6.0, some of the standard clang 6.0 options are available. These include:

  • Warning and error formatting flags

  • Optimisation flags

For more information about clang usage, refer to the clang 6.0 manual.

Compute++-specific Diagnostics

This section describes SYCL specific diagnostics. Compute++ errors, warnings and remarks are in the form of:

[Computecpp:CCxxxx] Error message

Where "xxxx" is an error code index.
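
Tools that post-process compiler output may want to extract the error code index from a diagnostic line. The sketch below does this with plain string handling; computecpp_code is a hypothetical helper, and it assumes the tag is spelled "[Computecpp:CC" followed by exactly four digits and a closing bracket, as in the diagnostics listed in this section.

```cpp
#include <cctype>
#include <string>

// Returns the numeric code from a "[Computecpp:CCxxxx]" tag,
// or -1 if the line carries no well-formed tag.
int computecpp_code(const std::string& line) {
  const std::string prefix = "[Computecpp:CC";
  const std::string::size_type start = line.find(prefix);
  if (start == std::string::npos) return -1;
  std::string::size_type pos = start + prefix.size();
  int code = 0;
  int digits = 0;
  while (pos < line.size() &&
         std::isdigit(static_cast<unsigned char>(line[pos]))) {
    code = code * 10 + (line[pos] - '0');
    ++pos;
    ++digits;
  }
  // Assumption: a well-formed tag has four digits and a closing bracket.
  const bool closed = pos < line.size() && line[pos] == ']';
  return (digits == 4 && closed) ? code : -1;
}
```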

[Computecpp:CC0001]

a SYCL kernel must be named with a class, struct, enum or union type

[Computecpp:CC0002]

cannot name a SYCL kernel with <type> - class, struct, enum or union expected

[Computecpp:CC0003]

an unnamed type is an invalid template argument for classes used to name SYCL kernels.

Valid SYCL kernel names must be a forward declarable class, struct, enum or union type.

The following examples are valid SYCL kernel names:

class ClassKernelName;

struct MyStruct;

enum class MyEnum : int;

The following types are invalid SYCL kernel names:

unsigned // not a class, struct, enum or union type

enum MyCpp98Enum {/* Definition */}; // not forward declarable

[Computecpp:CC0004]

a SYCL kernel with the name <type> (alias <type>) already exists

SYCL kernel names must be unique. Therefore, the same type cannot be used to name two kernels. In such cases, compute++ will emit a note to provide the location of the original name.

h.parallel_for<class MyName>(...);
h.parallel_for<class MyName>(...); // CC0004: the name is used twice
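
One way to keep names unique when similar kernels are submitted from several places is to parameterise the name type, so that each use site gets a distinct type. The sketch below shows the idea in plain C++ without any SYCL headers; KernelName and distinct_kernel_names are hypothetical names, and whether a particular template name is accepted by a given compute++ release is not verified here.

```cpp
#include <type_traits>

// Hypothetical name tag: forward declarable and never defined.
template <int Id>
class KernelName;

// True when two candidate kernel names are distinct types,
// which is what the uniqueness rule requires.
template <typename A, typename B>
constexpr bool distinct_kernel_names() {
  return !std::is_same<A, B>::value;
}

static_assert(distinct_kernel_names<KernelName<0>, KernelName<1>>(),
              "different Ids give different name types");
static_assert(!distinct_kernel_names<KernelName<0>, KernelName<0>>(),
              "reusing an Id reuses the name");
```

With such a tag, the two submissions in the example above would be written with KernelName<0> and KernelName<1> as their respective name types instead of repeating class MyName.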

[Computecpp:CC0005]

cannot name a SYCL kernel with nested class, struct, enum or union `<type>'

SYCL kernel names cannot be a nested class as it is not forward declarable.

struct MyClass {
  struct Nested {};
};

h.parallel_for<MyClass::Nested>(...); // CC0005

[Computecpp:CC0006]

<C++ construct> is not allowed within SYCL device code

C++ constructs that cannot be converted into valid OpenCL code are invalid in SYCL. These include:

  • Exceptions (try / catch expressions)

  • Accessing non-const global variables inside device code

  • Function pointers

  • RTTI (dynamic_cast, typeid)

  • Dynamic memory handling: new, delete

    • Note: placement new is allowed
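
The distinction between dynamic allocation and placement new can be seen in a short host-side sketch. It constructs an object in caller-provided stack storage, so no allocator is involved. Point and placement_new_sum are hypothetical names, and the code is ordinary C++ that has not been checked against a specific compute++ target.

```cpp
#include <new>

struct Point {
  int x;
  int y;
};

// Constructs a Point in local storage instead of on the heap.
int placement_new_sum(int x, int y) {
  alignas(Point) unsigned char storage[sizeof(Point)];
  Point* p = new (storage) Point{x, y};  // placement new: no allocation
  const int sum = p->x + p->y;
  p->~Point();  // explicit destructor call; there is no matching delete
  return sum;
}
```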

[Computecpp:CC0007]

cannot capture <expr> in a SYCL kernel

[Computecpp:CC0008]

a variable of type `<type>' cannot be captured by a SYCL kernel, because it is <reason>

A SYCL kernel cannot capture:

  • By reference

  • The "this" pointer

  • Raw pointers

For instance

int *ptr = new int(10); // Initialise a pointer
cgh.single_task<class example>([=]() {
    foo(ptr); // Illegal to use that pointer within kernel
});

The above code would give the error:

error: a variable of type 'int *' cannot be captured by a SYCL kernel, because it is a pointer type

[Computecpp:CC0009]

cannot capture object instance of class <class name> containing field <field name> of type `<type>' in a SYCL kernel

[Computecpp:CC0010]

cannot capture object instance of class <class name> containing field <field name> of type `<type>' in a SYCL kernel (makes class <class name> non-standard layout)

[Computecpp:CC0011]

cannot capture object <variable> of type `<type>' in a SYCL kernel, because <reason>

[Computecpp:CC0020]

SYCL kernel parameter `<param>' of type `<type>' is invalid: <reason>

A SYCL kernel cannot capture:

  • Any objects containing raw pointers

  • Any objects that do not have a C++ standard layout, including:

    • Objects with private AND public fields (either private or public is OK but not both)

    • Containing a non-static field of reference type

    • Any objects that contain virtual methods

For instance

struct FooClass
{
  FooClass(int i): m_pMember(new int(i)) {}

  int *m_pMember;
};

FooClass instance {1}; // Initialize the instance

cgh.single_task<class example>([=]() {
    foo(instance.m_pMember); // Illegal to use that pointer within kernel
});

The above code would give the error:

 error: can not capture object instance of type 'struct FooClass' in a SYCL kernel, because it contains a field with a pointer type

[Computecpp:CC0012]

class <name> cannot be a parameter to a SYCL kernel, because <reason>

[Computecpp:CC0013]

`<type>' is an invalid argument type for a SYCL kernel, because it is <reason>

One or more SYCL kernel parameter restrictions have been violated, which triggered this error.

This includes the following:

  • ``it is a non standard-layout type'': An object must be standard-layout to be passed as a kernel parameter.

  • ``it contains virtual methods'': Objects of a class that contains virtual methods cannot be supported as they are not standard-layout.

  • ``it contains a field with a pointer type'': Host raw pointers cannot be used on device with OpenCL 1.2. To pass a pointer to the device, SYCL buffers and accessors must be used instead.
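
The standard-layout conditions can be checked on the host with the standard <type_traits> header before a type is ever passed to a kernel. The sketch below uses hypothetical example types. Note that std::is_standard_layout does not detect pointer fields, so the pointer restriction remains a separate check performed by compute++.

```cpp
#include <type_traits>

struct PlainData { int a; float b; };   // standard layout

class MixedAccess {                     // private AND public fields
 public:
  int visible;
 private:
  int hidden;
};

struct WithReference { int& ref; };     // non-static reference field

struct WithVirtual {                    // virtual method
  virtual void f() {}
  int a;
};

struct WithPointer { int* p; };         // standard layout, yet still
                                        // rejected for its pointer field

// Hypothetical helper wrapping the standard trait.
template <typename T>
constexpr bool passes_layout_check() {
  return std::is_standard_layout<T>::value;
}

static_assert(passes_layout_check<PlainData>(), "plain struct is fine");
static_assert(!passes_layout_check<MixedAccess>(), "mixed access control");
static_assert(!passes_layout_check<WithReference>(), "reference field");
static_assert(!passes_layout_check<WithVirtual>(), "virtual method");
```

Placing a static_assert of this kind next to the definition of a type intended for kernel use gives an early host-side failure for layout problems.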

[Computecpp:CC0014]

SYCL kernel must return void

SYCL kernel functors or lambdas cannot return a value.

[Computecpp:CC0015]

recursion is not allowed in SYCL functions

[Computecpp:CC0016]

recursion is not allowed in SYCL functions, but a call graph cycle was detected starting in function `<function name>'

SYCL restrictions disallow the use of recursive functions as OpenCL disallows their use.

CC0016 is reported when the recursion is indirect, that is, when a succession of calls eventually forms a cycle.

int recurse(int i) { // CC0015
    return i < 0 ? 0 : recurse(i - 1);
}
int funcA(int);

int funcB(int i) {
  return funcA(i - 1) + i;
}

int funcA(int i) { // CC0016
    return i < 0 ? 0 : funcB(i - 1);
}
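
A common remedy is to rewrite the recursion as a loop. The sketch below keeps the funcA/funcB pair from the example as a host-side reference and adds funcA_iterative, a hypothetical loop-based equivalent obtained by unrolling one cycle of the call graph. It is one possible rewrite, not a transformation performed by compute++.

```cpp
// Host-side reference: the mutually recursive pair from the example.
int funcA(int);
int funcB(int i) { return funcA(i - 1) + i; }
int funcA(int i) { return i < 0 ? 0 : funcB(i - 1); }

// Hypothetical loop-based equivalent of funcA.
// Unrolling one cycle gives funcA(i) = funcA(i - 2) + (i - 1) for i >= 0.
int funcA_iterative(int i) {
  int total = 0;
  while (i >= 0) {
    total += i - 1;  // contribution of funcB(i - 1)
    i -= 2;          // argument of the next funcA call in the cycle
  }
  return total;
}
```

Only funcA_iterative would be called from device code; the recursive pair is kept here solely to compare results on the host.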

[Computecpp:CC0023]

class <name> is not standard layout, because <reason>

This error will accompany [Computecpp:CC0011] cannot capture object <variable> of type `<type>' in a SYCL kernel, because <reason> and will be thrown on the object definition directly. The reasons are the same as in that error.

  • Any objects containing raw pointers

  • Any objects that do not have a standard layout, including:

    • Objects with private AND public fields (either private or public is OK but not both)

    • Containing a non-static field of reference type

    • Any objects that contain virtual methods

[Computecpp:CC0027]

Some memcpy/memset intrinsics added by the llvm optimizer were replaced by serial functions. This is a workaround for OpenCL drivers that do not support those intrinsics. This may impact performance, consider using -no-serial-memop.

On certain OpenCL drivers we have found that certain LLVM intrinsic functions are not supported, so we replace them with our own code, which will work on all drivers. However, this could stop a vendor compiler that does support these intrinsics from performing certain optimisations, so the option -no-serial-memop is available to instruct compute++ not to replace them. Generally this remark can be ignored as the code is still valid, but there may be a performance impact.

[Computecpp:CC0028]

OpenCL requires the format string of printf to be a constant string.

This warning is emitted if you attempt to pass anything other than a string literal as the format string of the printf function. Code like the following should produce this warning.

const char * str = "somestring";
printf(str);

The preferred method of printing from a SYCL kernel is to pass a stream object into the kernel.

myQueue.submit([&](handler& cgh) {
  stream os(1024, 80, cgh);
  cgh.single_task<class example>([=]() {
    os << "somestring" << endl;
  });
});

[Computecpp:CC0029]

The `noduplicate' attribute was removed from some functions in order to guarantee conformance with SPIR 1.2. To keep that attribute the -sycl-keep-attr=<noduplicate> command-line option can be used

This remark is emitted whenever the compiler removes the attribute "noduplicate" from a function or function call. It is a remark rather than a warning or an error because this is expected behavior from the compiler and the code remains valid.

[Computecpp:CC0030]

sycl requires at least C++11, force language standard to C++11

The SYCL specification is defined to support C++11 features. Therefore, a SYCL program requires a language standard that includes the C++11 feature set.

The language standard provided to the "-std" command-line option is incompatible with SYCL and the language standard has been forced to C++11.

[Computecpp:CC0031]

`<string>' is not a valid input to spir-keep-attr. Attribute `noduplicate' is the only supported argument

The value <value> given to -sycl-keep-attr is invalid. Currently, only noduplicate is a supported attribute. See -sycl-keep-attr help.

[Computecpp:CC0032]

Variadic functions cannot be used in SYCL device code

SYCL device functions cannot call C-style variadic functions.

Calls like below are not allowed on device.

void variadic(int, ...);

[...]

myQueue.submit([&](handler& cgh) {
   cgh.single_task<class example>([=]() {
     variadic(2, 42., "str");
   });
});
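
Where a variable number of arguments is genuinely needed, a C++11 variadic template is a possible alternative, because the argument pack is expanded at compile time and no C-style "..." call remains. The sketch below is plain C++; sum_all is a hypothetical function, and its acceptance in device code by a given compute++ release is assumed rather than verified.

```cpp
// Hypothetical replacement for a C-style variadic function:
// sums its arguments after converting each one to int.
template <typename... Args>
int sum_all(Args... args) {
  int total = 0;
  // Braced-init-list elements are evaluated left to right, so each
  // argument is added exactly once, in order. The leading 0 keeps the
  // array non-empty when the pack is empty.
  int expand[] = {0, (total += static_cast<int>(args), 0)...};
  (void)expand;  // silence unused-variable warnings
  return total;
}
```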

[Computecpp:CC0033]

--sycl-no-diags is deprecated and has no effect.

The flag --sycl-no-diags is now deprecated and the use of this flag does not have any effect any more.

[Computecpp:CC0034]

Function <name> is undefined but referenced on the device and the associated kernels may fail to build or execute at run time

You will see this warning if an undefined function which does not match one of the known OpenCL builtin functions has been referenced in device code. If the function cannot be resolved by the OpenCL implementation, the kernel is unlikely to compile.

[Computecpp:CC0035]

Intrinsic <intrinsic_name> has been generated in function <function_name> which is illegal in SPIR and may result in a compilation failure

The SPIR specification only allows the memset and memcpy intrinsics to be used; anything else is technically illegal and could cause issues with the OpenCL implementation. This warning tells you that an illegal intrinsic has been generated.

[Computecpp:CC0036]

OpenCL extension cl_khr_fp16 should be enabled before using type half

This warning will appear if cl::sycl::half (alias to __fp16) is used in code before the OpenCL extension cl_khr_fp16 is enabled. The presence or absence of this warning says nothing about whether the underlying OpenCL implementation supports half. You must check this manually using the "has_extension" method of the runtime device class.

[Computecpp:CC0037]

Variable length arrays are not supported in SYCL kernels.

Variable length arrays (VLA) are disallowed by OpenCL v1.2 s6.9.d, thus cannot be used in a SYCL kernel.

Note: This is a C99 feature, but some compilers provide extensions to support it in C++.

void foo(unsigned n) {
  int vla[n]; // illegal if called from a kernel
}
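
When the size is known at compile time, the array bound can be made a template parameter so that the array is a fixed-size array rather than a VLA. The following sketch illustrates this; sum_of_squares is a hypothetical function used only to show the pattern.

```cpp
#include <cstddef>

// The array bound is a template parameter, so it is a compile-time
// constant and the local array is not a VLA.
template <std::size_t N>
int sum_of_squares() {
  static_assert(N > 0, "a zero-length array is not valid C++");
  int values[N];
  for (std::size_t i = 0; i < N; ++i) {
    values[i] = static_cast<int>(i * i);
  }
  int total = 0;
  for (std::size_t i = 0; i < N; ++i) {
    total += values[i];
  }
  return total;
}
```

A call such as sum_of_squares<4>() fixes the bound at the call site. If the size is only known at run time, this pattern does not apply and the data has to be passed in by other means.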

[Computecpp:CC0039]

Kernel <SYCL kernel name>: conflicting values for attribute <OpenCL attribute name>
Attribute <OpenCL attribute name> values inherited from <Function name>
Registered values '<OpenCL attribute values>'

OpenCL attributes vec_type_hint, work_group_size_hint and reqd_work_group_size are allowed to be used in SYCL. In SYCL, those attributes are applied to functions (instead of kernel function in OpenCL) and the compiler infers from the function calls which OpenCL attributes should be applied to the kernel. If a function A and a function B both use the same attribute, then they must use the same values. If they do not agree on those values, then the compiler reports an error pointing out the affected kernel and the functions causing the issue.

__attribute__((reqd_work_group_size(64, 64, 1)))
void A() {}
__attribute__((reqd_work_group_size(64, 1, 1)))
void B() {}

[...]

myQueue.submit([&](handler& cgh) {
   cgh.single_task<class Ker>([=]() {
     A();
     B();
   });
});

The example above will generate the following diagnostic

error: [Computecpp:CC0039]: Kernel Ker: conflicting values for attribute reqd_work_group_size
note: [Computecpp:CC0039]: Attribute reqd_work_group_size values inherited from A
note: [Computecpp:CC0039]: Registered values 'i32 64, i32 64, i32 1'
note: [Computecpp:CC0039]: Attribute reqd_work_group_size values inherited from B
note: [Computecpp:CC0039]: Registered values 'i32 64, i32 1, i32 1'
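
The agreement rule behind this diagnostic can be modelled as a comparison of the attribute values inherited from each called function. The sketch below is an illustrative model only; SizeAttr and attributes_agree are hypothetical names and do not reflect how compute++ is implemented internally.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one reqd_work_group_size annotation.
struct SizeAttr {
  std::string function;  // function carrying the attribute
  int x;
  int y;
  int z;
};

// Returns true when every annotated function reachable from a kernel
// agrees on the attribute values, mirroring the rule described above.
bool attributes_agree(const std::vector<SizeAttr>& inherited) {
  for (std::size_t i = 1; i < inherited.size(); ++i) {
    const SizeAttr& first = inherited[0];
    const SizeAttr& other = inherited[i];
    if (other.x != first.x || other.y != first.y || other.z != first.z) {
      return false;  // corresponds to a conflict for this kernel
    }
  }
  return true;
}
```

In this model, the values from A (64, 64, 1) and B (64, 1, 1) in the example disagree, which corresponds to the reported conflict.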

[Computecpp:CC0040]

Accessor appears as member of union used as a kernel argument - associated kernel may not run correctly

Using unions that contain accessor fields as kernel arguments is discouraged and results in undefined behavior.

[...]
myQueue.submit([&](handler& cgh) {
   union {
     accessor<int, 1, access::mode::read_write, access::target::global_buffer> Acc;
     int IntField;
   } Arg;
   cgh.single_task<class Ker>([=]() {
     Arg.IntField; // CC0040
   });
});

[Computecpp:CC0041]

Invalid SYCL target <target>

The value given to -sycl-target is not a valid device target.

[Computecpp:CC0042]

Community Edition releases do not support multiple device targets

Community edition releases are restricted to one device target per kernel.

Compute++ Internal Diagnostics

The following diagnostics are internal. If you encounter one of them and the ComputeCpp package is correctly installed, please file a bug report.

[Computecpp:CC0017]

ignored invalid use of SYCL kernel attribute

[Computecpp:CC0018]

template specialization of a SYCL kernel is not a SYCL kernel without the proper attributes

[Computecpp:CC0019]

`%0' is invalid for a SYCL kernel and was ignored

[Computecpp:CC0021]

SYCL kernels can only have one functor parameter: parameter `<param>' of type `<type>' cannot be one, because parameter `<param>' of type `<type>' already is

[Computecpp:CC0022]

a SYCL kernel cannot also be an OpenCL kernel

[Computecpp:CC0024]

Missing serial function equivalent for intrinsic <name> (expected prototype <name> for serial function)

[Computecpp:CC0025]

Serial function <name> has no body (equivalent for intrinsic <name>)

[Computecpp:CC0026]

Serial function <name> has an incorrect signature (equivalent for intrinsic <name>)

[Computecpp:CC0038]

Invalid SYCL attribute for function <reason>
Invalid OpenCL attribute for function <reason>

Some internal metadata are not correctly formed.

[Computecpp:CC0043]

Redefinition of binary metadata for target <target> (<arch-size> bits)

[Computecpp:CC0044]

Redefinition of kernel <kernel name> metadata

[Computecpp:CC0045]

Error reading SYCL metadata file: <file>

[Computecpp:CC0046]

Expect the same number of binary information placeholders (got <i>) and kernel input files (got <j>)

AUTHOR

Codeplay Software Ltd <sycl@codeplay.com>

AMD is a registered trademark of Advanced Micro Devices, Inc. Intel is a trademark of Intel Corporation or its subsidiaries in the U.S. and/or other countries. NVIDIA and CUDA are registered trademarks of NVIDIA Corporation.