Please note that you are viewing a guide targeting an older version of oneAPI for AMD GPUs (beta). This guide was designed for version 2024.0.0.

Troubleshooting

This section covers troubleshooting tips and solutions to common issues. If the following doesn’t fix your problem, please submit a support request via Codeplay’s community support website. We cannot provide any guarantees of support, but we will try to help. Please ensure that you are using the most recent stable release of the software before submitting a support request.

Bugs, performance, and feature requests can be reported via the oneAPI DPC++ compiler open-source repository.

Missing Devices in `sycl-ls` Output

If sycl-ls does not list the expected devices within the system:

Check that there is a compatible version of the CUDA® or ROCm™ SDK installed on the system (for CUDA or HIP plugins respectively), as well as the compatible drivers.
Check that nvidia-smi or rocm-smi can correctly identify the devices.
Check that the plugins are correctly loaded. This can be done by setting the environment variable SYCL_PI_TRACE to 1 and running sycl-ls again. For example:
```
SYCL_PI_TRACE=1 sycl-ls
```
You should see output similar to the following:
```
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_opencl.so [ PluginVersion: 11.15.1 ]
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_level_zero.so [ PluginVersion: 11.15.1 ]
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_cuda.so [ PluginVersion: 11.15.1 ]
[ext_oneapi_cuda:gpu:0] NVIDIA CUDA BACKEND, NVIDIA A100-PCIE-40GB 0.0 [CUDA 11.7]
```
If the plugin you’ve installed doesn’t show up in the sycl-ls output, you can run it again with SYCL_PI_TRACE this time set to -1 to see more details of the error:
```
SYCL_PI_TRACE=-1 sycl-ls
```
Within the output, which can be quite large, you may see errors like the following:
```
SYCL_PI_TRACE[-1]: dlopen(/opt/intel/oneapi/compiler/2024.0.0/linux/lib/libpi_hip.so) failed with <libamdhip64.so.4: cannot open shared object file: No such file or directory>
SYCL_PI_TRACE[all]: Check if plugin is present. Failed to load plugin: libpi_hip.so
```
- The CUDA plugin requires libcuda.so and libcupti.so from the CUDA SDK.
- The HIP plugin requires libamdhip64.so from ROCm.
Double-check your CUDA or ROCm installation and make sure that the environment is set up properly, i.e. that LD_LIBRARY_PATH points to the locations of the above libraries.
Check that there isn’t any device filtering environment variable set such as ONEAPI_DEVICE_SELECTOR (note that sycl-ls will warn if this one is set), or SYCL_DEVICE_ALLOWLIST.
Check permissions. On POSIX systems, access to accelerator devices is typically restricted to members of particular groups. For example, on Ubuntu Linux, GPU access may require membership of the video and render groups, but this can vary depending on your configuration.
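
Several of the checks above can be scripted. The following is a minimal sketch for a POSIX shell, not part of the oneAPI tooling: the helper names `find_lib_on_path`, `report_device_filters` and `plugin_load_errors` are hypothetical. Note that it only inspects a colon-separated search path, whereas the dynamic loader may also find libraries through other mechanisms, so a miss here is a hint rather than proof.

```shell
#!/bin/sh
# Hypothetical helpers (not provided by oneAPI) for the checks described above.

# Print the first directory in a colon-separated search path ($2, defaulting
# to LD_LIBRARY_PATH) that contains a file whose name starts with $1.
# Returns non-zero if no such directory is found.
find_lib_on_path() {
    lib_prefix=$1
    remaining=${2-${LD_LIBRARY_PATH-}}:
    while [ -n "$remaining" ]; do
        dir=${remaining%%:*}
        remaining=${remaining#*:}
        [ -n "$dir" ] || continue
        for candidate in "$dir"/"$lib_prefix"*; do
            if [ -e "$candidate" ]; then
                printf '%s\n' "$dir"
                return 0
            fi
        done
    done
    return 1
}

# Print each device-filtering variable mentioned in this guide that is set
# to a non-empty value.
report_device_filters() {
    for var in ONEAPI_DEVICE_SELECTOR SYCL_DEVICE_ALLOWLIST; do
        eval "value=\${$var-}"
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$var" "$value"
        fi
    done
}

# Filter a SYCL_PI_TRACE=-1 trace read from stdin down to the plugin
# loading errors of the kind shown above.
plugin_load_errors() {
    grep -E 'dlopen\(.*\) failed with|Failed to load plugin'
}
```

For example, `find_lib_on_path libamdhip64.so` reports where the HIP runtime is found on LD_LIBRARY_PATH, `SYCL_PI_TRACE=-1 sycl-ls 2>&1 | plugin_load_errors` shortens the trace to the relevant lines, and `id -nG` lists the groups of the current user for the permissions check.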

Dealing with Invalid Binary Errors

Incorrect Platform

A common mistake is to run a SYCL program on a platform for which it does not have a compatible binary. For example, the program may have been compiled for a SPIR-V backend but then executed on a HIP device. In that case, the error PI_ERROR_INVALID_BINARY is reported. If you see this error, check the following points:

- Make sure your target platform is in -fsycl-targets so that the program will be compiled for the required platform(s).
- Make sure that the program is using a SYCL platform or device selector that is compatible with the platforms for which the executable was compiled. Try running with the environment variable SYCL_PI_TRACE=1 to print which device is being selected at runtime.

Correct Platform, Incorrect Device

When running SYCL™ applications targeting CUDA or HIP, under certain circumstances the application may fail and report an error about an invalid binary. For example, for CUDA it may report CUDA_ERROR_NO_BINARY_FOR_GPU.

This means that the SYCL device selected was provided with a binary for the correct platform but an incorrect architecture. In that scenario, check the following points:

Make sure your target is in -fsycl-targets and that the correct architecture matching the available hardware is specified with the flags:
- Flags for CUDA: -Xsycl-target-backend=nvptx64-nvidia-cuda --cuda-gpu-arch=<arch>
- Flags for HIP: -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=<arch>
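
To pick the `<arch>` value for HIP, the architecture name can be read from the output of `rocminfo`. The sketch below is illustrative only: the helper names `first_gfx_arch` and `hip_compile_command` are hypothetical, the parsing assumes that `rocminfo` prints lines of the form `Name: gfx...`, and the file names passed in are placeholders. The second helper only prints a compile command built from the HIP flags listed above; it does not run the compiler.

```shell
#!/bin/sh
# Hypothetical helpers (not provided by oneAPI or ROCm).

# Read rocminfo-style text on stdin and print the first gfx architecture
# name found on a "Name:" line (assumed output format).
first_gfx_arch() {
    sed -n 's/^ *Name: *\(gfx[0-9a-z]*\).*$/\1/p' | head -n 1
}

# Print, without executing it, a compile command for one AMD architecture.
# $1 = architecture (e.g. gfx90a), $2 = source file, $3 = output file.
hip_compile_command() {
    arch=$1
    src=$2
    out=$3
    printf 'icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=%s -o %s %s\n' \
        "$arch" "$out" "$src"
}
```

A typical use would be `hip_compile_command "$(rocminfo | first_gfx_arch)" app.cpp app`, after checking that the detected architecture is the one you intend to target.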

Ensure that the correct SYCL device (matching the architecture that the application was built for) is selected at run-time. The environment variable SYCL_PI_TRACE=1 can be used to display more information on which device was selected, for example:

```
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_opencl.so [ PluginVersion: 11.16.1 ]
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_level_zero.so [ PluginVersion: 11.16.1 ]
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_cuda.so [ PluginVersion: 11.16.1 ]
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::automatic
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::automatic
SYCL_PI_TRACE[all]: Selected device: -> final score = 1500
SYCL_PI_TRACE[all]:   platform: NVIDIA CUDA BACKEND
SYCL_PI_TRACE[all]:   device: NVIDIA GeForce GTX 1050 Ti
```

If an incorrect device is selected, the environment variable ONEAPI_DEVICE_SELECTOR may be used to help the SYCL device selector pick the correct one. See the Environment Variables section of the Intel® oneAPI DPC++/C++ Compiler documentation.
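
As a rough illustration of the two steps above (finding out which device was selected, then overriding the choice), the following sketch defines two small helpers. Both names, `selected_device` and `run_with_selector`, are hypothetical. The parsing assumes trace lines in the format of the sample output above, and the selector values in the usage note (`hip:0`, `hip:*`) are examples that should be checked against the Environment Variables documentation for your release.

```shell
#!/bin/sh
# Hypothetical helpers (not provided by oneAPI).

# Read a SYCL_PI_TRACE=1 trace on stdin and print the name of the first
# selected device, assuming lines like "SYCL_PI_TRACE[all]:   device: <name>".
selected_device() {
    sed -n 's/^SYCL_PI_TRACE\[all\]: *device: *//p' | head -n 1
}

# Run a command with ONEAPI_DEVICE_SELECTOR set only for that command.
# $1 = selector string, remaining arguments = command to run.
run_with_selector() {
    selector=$1
    shift
    ONEAPI_DEVICE_SELECTOR="$selector" "$@"
}
```

For example, `SYCL_PI_TRACE=1 ./app 2>&1 | selected_device` prints the device the runtime chose for a placeholder binary `./app`, and `run_with_selector 'hip:0' ./app` reruns it restricted to the first HIP device without changing the environment of the calling shell.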

Unresolved extern function ‘…’ / Undefined external symbol ‘…’

This may be caused by a number of things.

There is currently no support for std::complex in DPC++. Please use sycl::complex instead.

Some C++ standard library math functionality declared in <cmath> (such as std::cos, logf and sinf) is not supported in kernel code for the AMDGPU backend of DPC++. Please use the equivalent functions in the sycl namespace instead.

See Install oneAPI for AMD GPUs (beta) for more information.

Sub-group size issues in codes ported across platforms/architectures

Consider code that uses the reqd_sub_group_size kernel attribute to request a specific sub-group size, and that is then ported to a different platform or run on a different architecture from the one it was originally written for. If the requested sub-group size is not supported by that platform/architecture, a runtime error will be thrown:

Sub-group size x is not supported on the device

On the CUDA platform only a single sub-group size is supported, hence only a warning is given:

CUDA requires sub_group size 32

and the runtime will use the sub-group size of 32 instead of the requested sub-group size. The reqd_sub_group_size kernel attribute is designed for platforms/architectures that support multiple sub-group sizes. Note that some SYCL code is not portable across different sub-group sizes. For example, the result of the sub-group collective reduce_over_group will depend on the sub-group size. Users that want to write code that is portable across platforms/architectures which use different sub-group sizes should either:

- Write code in a portable way such that the result does not depend on sub-group size.
- Provide different versions of the sub-group-size-sensitive parts of the code for different platforms/architectures, to take account of their different sub-group sizes.
