This section covers troubleshooting tips and solutions to common issues. If the following doesn’t fix your problem, please submit a support request via Codeplay’s community support website. We cannot provide any guarantees of support, but we will try to help. Please ensure that you are using the most recent stable release of the software before submitting a support request.
Bug reports, performance issues, and feature requests can be submitted via the oneAPI DPC++ compiler open-source repository.
Missing Devices in sycl-ls Output
If sycl-ls does not list the expected devices within the system:
- Check that there is a compatible version of the CUDA® or ROCm™ SDK installed on the system (for CUDA or HIP plugins respectively), as well as the compatible drivers. 
- Check that nvidia-smi or rocm-smi can correctly identify the devices.
- Check that the plugins are correctly loaded. This can be done by setting the environment variable SYCL_PI_TRACE to 1 and running sycl-ls again. For example:
    SYCL_PI_TRACE=1 sycl-ls
  You should see output similar to the following:
    SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_opencl.so [ PluginVersion: 11.15.1 ]
    SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_level_zero.so [ PluginVersion: 11.15.1 ]
    SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_cuda.so [ PluginVersion: 11.15.1 ]
    [ext_oneapi_cuda:gpu:0] NVIDIA CUDA BACKEND, NVIDIA A100-PCIE-40GB 0.0 [CUDA 11.7]
  If the plugin you’ve installed doesn’t show up in the sycl-ls output, you can run it again with SYCL_PI_TRACE this time set to -1 to see more details of the error:
    SYCL_PI_TRACE=-1 sycl-ls
  Within the output, which can be quite large, you may see errors like the following:
    SYCL_PI_TRACE[-1]: dlopen(/opt/intel/oneapi/compiler/2023.2.0/linux/lib/libpi_hip.so) failed with <libamdhip64.so.4: cannot open shared object file: No such file or directory>
    SYCL_PI_TRACE[all]: Check if plugin is present. Failed to load plugin: libpi_hip.so
  - The CUDA plugin requires libcuda.so and libcupti.so from the CUDA SDK.
  - The HIP plugin requires libamdhip64.so from ROCm.
  Double-check your CUDA or ROCm installation and make sure that the environment is set up properly, i.e. that LD_LIBRARY_PATH points to the correct locations to find the above libraries.
- Check that there isn’t any device filtering environment variable set such as ONEAPI_DEVICE_SELECTOR (note that sycl-ls will warn if this one is set), or SYCL_DEVICE_ALLOWLIST.
- Check permissions. On POSIX systems, access to accelerator devices is typically gated on being a member of the proper groups. For example, on Ubuntu Linux GPU access may require membership of the video and render groups, but this can vary depending on your configuration.
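Some of the checks above can be scripted. The following is a minimal sketch in POSIX shell, not an official tool: the function names loaded_plugins, active_device_filters and in_group are hypothetical, and the trace format is assumed to match the sample output shown in this section.

```shell
# Hypothetical helpers; none of these names are part of any oneAPI tool.

# Read a SYCL_PI_TRACE=1 log on stdin and print the plugin libraries
# reported as successfully loaded, one per line.
loaded_plugins() {
    sed -n 's/.*Plugin found and successfully loaded: \([^ ]*\).*/\1/p'
}

# Print the name of each device-filtering variable that is set and non-empty.
active_device_filters() {
    for var in ONEAPI_DEVICE_SELECTOR SYCL_DEVICE_ALLOWLIST; do
        eval "val=\${$var:-}"
        if [ -n "$val" ]; then
            printf '%s\n' "$var"
        fi
    done
}

# Succeed if the space-separated group list in $1 contains the group $2.
in_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage (requires sycl-ls on PATH):
#   SYCL_PI_TRACE=1 sycl-ls 2>&1 | loaded_plugins
#   active_device_filters
#   in_group "$(id -nG)" render || echo "current user is not in the render group"
```

If libpi_cuda.so or libpi_hip.so is missing from the loaded_plugins output, rerun with SYCL_PI_TRACE=-1 as described above to see why the plugin failed to load.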
Dealing with Invalid Binary Errors
When running SYCL™ applications targeting CUDA or HIP, under certain
circumstances the application may fail and report an error about an
invalid binary. For example, for CUDA it may report
CUDA_ERROR_NO_BINARY_FOR_GPU.
This means that the SYCL device selected was provided with a binary for the incorrect architecture. In that scenario, check the following points:
- Make sure your target is in -fsycl-targets and that the correct architecture matching the available hardware is specified with the flags:
  - Flags for CUDA: -Xsycl-target-backend=nvptx64-nvidia-cuda --cuda-gpu-arch=<arch>
  - Flags for HIP: -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=<arch>
- Ensure that the correct SYCL device (matching the architecture that the application was built for) is selected at run-time. The environment variable SYCL_PI_TRACE=1 can be used to display more information on which device was selected, for example:
    SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_opencl.so [ PluginVersion: 11.16.1 ]
    SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_level_zero.so [ PluginVersion: 11.16.1 ]
    SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_cuda.so [ PluginVersion: 11.16.1 ]
    SYCL_PI_TRACE[all]: Requested device_type: info::device_type::automatic
    SYCL_PI_TRACE[all]: Requested device_type: info::device_type::automatic
    SYCL_PI_TRACE[all]: Selected device: -> final score = 1500
    SYCL_PI_TRACE[all]: platform: NVIDIA CUDA BACKEND
    SYCL_PI_TRACE[all]: device: NVIDIA GeForce GTX 1050 Ti
- If an incorrect device is selected, the environment variable ONEAPI_DEVICE_SELECTOR may be used to help the SYCL device selector pick the correct one; see the Environment Variables section of the Intel® oneAPI DPC++/C++ Compiler documentation.
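To make the flag combinations above concrete, here is a small sketch that assembles them for a given backend. The function name sycl_target_flags is hypothetical, and the architecture values in the usage comments (sm_80, gfx90a) are only examples; substitute the architecture of your own hardware.

```shell
# Hypothetical helper: print the SYCL target flags for a backend and architecture.
#   $1: backend, either "cuda" or "hip"
#   $2: architecture, e.g. sm_80 or gfx90a (must match the installed GPU)
sycl_target_flags() {
    case "$1" in
        cuda) triple=nvptx64-nvidia-cuda; arch_flag=--cuda-gpu-arch ;;
        hip)  triple=amdgcn-amd-amdhsa;   arch_flag=--offload-arch ;;
        *)    echo "unknown backend: $1" >&2; return 1 ;;
    esac
    printf '%s\n' "-fsycl -fsycl-targets=$triple -Xsycl-target-backend=$triple $arch_flag=$2"
}

# Example usage (app.cpp is a placeholder source file):
#   icpx $(sycl_target_flags cuda sm_80) app.cpp -o app
#   ONEAPI_DEVICE_SELECTOR=cuda:0 SYCL_PI_TRACE=1 ./app
```

The second command in the usage comments restricts device selection to the first CUDA device and prints the trace, so you can confirm that the selected device matches the architecture the binary was built for.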
Code execution hangs
On the CUDA backend, code execution will hang when any of the following group algorithms is used with double-precision floating-point numbers and compiled with the icpx compiler:
- broadcast
- joint_exclusive_scan
- joint_inclusive_scan
- exclusive_scan_over_group
- inclusive_scan_over_group
If you wish to use these group algorithms, use the DPC++ clang++ compiler driver instead.
See Install oneAPI for NVIDIA GPUs for more information.
Opaque pointers are only supported in ‘-opaque-pointers’ mode
In some cases, when building for both spir and NVIDIA or AMD targets, the compiler will fail with an error such as the following:
clang-offload-bundler: error: Opaque pointers are only supported in -opaque-pointers mode (Producer: 'Intel.oneAPI.DPCPP.Compiler_2023.2.0' Reader: 'Intel.oneAPI.DPCPP.Compiler_2023.2.0')
The workaround for this issue is usually to swap the order of the targets in the
-fsycl-targets option.
For example the following command will fail:
$ icpx -fsycl -fsycl-targets=spir64,nvptx64-nvidia-cuda test.cpp -c -o test.o
clang-offload-bundler: error: Opaque pointers are only supported in -opaque-pointers mode (Producer: 'Intel.oneAPI.DPCPP.Compiler_2023.2.0' Reader: 'Intel.oneAPI.DPCPP.Compiler_2023.2.0')
icpx: error: clang-offload-bundler command failed with exit code 1 (use -v to see invocation)
But the following command will succeed:
$ icpx -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64 test.cpp -c -o test.o
This is a compiler bug: the NVIDIA and AMD targets have switched to opaque pointers, but the spir targets have not switched yet. This will be fixed in a future version of the compiler.
Unresolved extern function ‘…’ / Undefined external symbol ‘…’
This may be caused by a number of things.
- There is currently no support for std::complex in DPC++. Please use sycl::complex instead.
- The icpx compiler driver uses -ffast-math mode by default, which can currently lead to some issues resolving certain math functions such as ldexp or logf. This can be worked around by disabling -ffast-math with the -fno-fast-math flag.
  See Install oneAPI for NVIDIA GPUs for more information.
Compiler Error: “cannot find libdevice”
If the CUDA SDK is not installed in a standard location, clang++ may
fail to find it, leading to errors during compilation such as:
clang-16: error: cannot find libdevice for sm_50; provide path to different CUDA installation via '--cuda-path', or pass '-nocudalib' to build without linking with libdevice
To fix this issue, specify the path to your CUDA installation using the
--cuda-path option.
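As an illustration, the sketch below searches a list of candidate directories for one that contains nvvm/libdevice, which is where the CUDA SDK ships libdevice. The function name find_cuda_path and the candidate directories in the usage comments are assumptions; adjust them to match your system.

```shell
# Hypothetical helper: print the first argument that looks like a CUDA
# installation, i.e. a directory containing nvvm/libdevice.
# Returns non-zero if none of the candidates match.
find_cuda_path() {
    for dir in "$@"; do
        if [ -d "$dir/nvvm/libdevice" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Example usage (candidate paths and app.cpp are placeholders):
#   cuda_path=$(find_cuda_path /usr/local/cuda /opt/cuda) &&
#       clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
#           --cuda-path="$cuda_path" app.cpp -o app
```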
Compiler Error: “needs target feature”
Some nvptx builtins that are used by the DPC++ runtime require a minimum
Compute Capability in order to compile. If you have not targeted a
sufficient Compute Capability for a builtin that you’re using in your
program (by using the compiler argument
-Xsycl-target-backend=nvptx64-nvidia-cuda --cuda-gpu-arch=sm_xx), an
error with the following pattern will be reported:
error: '__builtin_name' needs target feature (sm_70|sm_72|..),...
In order to avoid such an error, ensure that you are compiling for a device
with a sufficient Compute Capability.
If you are still getting such an error despite passing a supported Compute
Capability to the compiler, this may be because you are passing the 32-bit
triple, nvptx-nvidia-cuda to -fsycl-targets. The
nvptx-nvidia-cuda triple does not allow the compilation of target
feature builtins and is not officially supported by DPC++. The 64-bit
triple, nvptx64-nvidia-cuda, is supported by all modern NVIDIA® devices,
so it is always recommended.
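If you are unsure which sm_xx value to pass, one option is to derive it from the Compute Capability of the installed GPU. The sketch below is a hypothetical helper (compute_cap_to_arch is not part of any toolkit); the usage comments assume that your nvidia-smi version supports the compute_cap query field, which older drivers may not.

```shell
# Hypothetical helper: convert a Compute Capability such as "7.5" into the
# matching architecture name "sm_75" by dropping the dot.
compute_cap_to_arch() {
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Example usage (assumes nvidia-smi supports the compute_cap query field;
# app.cpp is a placeholder source file):
#   cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
#   clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
#       -Xsycl-target-backend=nvptx64-nvidia-cuda \
#       --cuda-gpu-arch="$(compute_cap_to_arch "$cap")" app.cpp -o app
```

Note that the usage comments pass the 64-bit triple nvptx64-nvidia-cuda both to -fsycl-targets and to -Xsycl-target-backend, as recommended above.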
Compiler Warning: “CUDA version is newer than the latest supported version”
Depending on the CUDA version used with the release, the compiler may output the following warning:
clang++: warning: CUDA version is newer than the latest supported version 12.1 [-Wunknown-cuda-version]
In most cases this warning can safely be ignored. It simply means that DPC++ may not use some of the latest CUDA features, but it should still work perfectly fine in most scenarios.