This section contains information, tips and pointers on debugging SYCL™ applications on various devices.
The host part of a SYCL application can simply be debugged as a C++ application, but kernel debugging support or tooling may vary depending on the device being targeted.
Note
If a SYCL application is generic, it may sometimes be helpful to debug it on devices with better debugging support and tooling such as an Intel® OpenCL™ CPU device, rather than on the actual target device.
Debugging on an Intel OpenCL CPU Device
Documentation regarding debugging a DPC++ application using the Intel OpenCL CPU device can be found in the Debugging the DPC++ and OpenMP* Offload Process section of the Intel oneAPI Programming Guide.
Debugging code using DPC++ on NVIDIA® GPUs
With DPC++ and an NVIDIA GPU you can use cuda-gdb to debug kernels compiled for the DPC++ CUDA backend. The oneapi-gdb tool, included in the oneAPI Base Toolkit, is only capable of debugging Intel processors, so it cannot be used with NVIDIA GPUs. The cuda-gdb tool is part of the CUDA® Toolkit downloaded from NVIDIA, so you will need to ensure it is installed before using this debugger.
Consult the cuda-gdb documentation for details on the correct usage of cuda-gdb.
When compiling code for use with the cuda-gdb debugger, it is necessary to enable debug symbols by using the following flags:
icpx -G -O0 …
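For example, a full compile command for the DPC++ CUDA backend might look like the following (a sketch only; the source and output file names are placeholders, and -fsycl-targets=nvptx64-nvidia-cuda is the usual target for NVIDIA GPUs):
icpx -fsycl -fsycl-targets=nvptx64-nvidia-cuda -G -O0 -o myexecutable main.cpp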
Note
If you encounter unexpected segmentation faults or communication errors like those seen in this post, try setting the following environment variable when launching cuda-gdb:
CUDBG_USE_LEGACY_DEBUGGER=1
Command Line Debugging
To start the debugger from the command line use this command with your executable:
cuda-gdb ./myexecutable
The commands and documentation for using the debugger are documented on the NVIDIA website.
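As a rough sketch, a typical session might set a breakpoint on a kernel source line and then run the program (the file name and line number are placeholders):
break main.cpp:42
run
next
backtrace
continue
These are standard gdb commands and work the same way in cuda-gdb.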
Note
When starting a debug session, cuda-gdb may report a security issue warning about the oneAPI Python script. To remove this warning, create the cuda-gdb initialization file cuda-gdbinit in ~/.config/gdb/ and insert a line that adds the path of the gdb Python script in the oneAPI Base Toolkit to the safe path list:
add-auto-load-safe-path /path/to/libsycl.so.6.1.0-gdb.py
For example, this might be /opt/intel/oneapi/compiler/2023.1.0/linux/lib/libsycl.so.6.1.0-gdb.py, in which case the line to add would be:
add-auto-load-safe-path /opt/intel/oneapi/compiler/2023.1.0/linux/lib/libsycl.so.6.1.0-gdb.py
Debugging using VSCode with cuda-gdb
Note
For detailed information on how to set up VSCode to work with SYCL, you can refer to the following guide.
To use the debugger with VSCode you will need to modify the VSCode debugger configuration file launch.json to use the cuda-gdb command when launching. You will also need to make sure that the compilation flags mentioned previously, -G and -O0, are included as parameters to the icpx compiler in the VSCode project build configuration file tasks.json to ensure it generates the required debug symbols.
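As a rough sketch, assuming the Microsoft C/C++ extension (cppdbg) is being used, the relevant launch.json entry might look similar to the following (the program path and the cuda-gdb install location are placeholders):
{
    "name": "Debug with cuda-gdb",
    "type": "cppdbg",
    "request": "launch",
    "program": "${workspaceFolder}/myexecutable",
    "cwd": "${workspaceFolder}",
    "MIMode": "gdb",
    "miDebuggerPath": "/usr/local/cuda/bin/cuda-gdb"
}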
With VSCode configured to launch a debug session, hit shift+ctrl+p to bring up the command palette, type in Debug: Select Debug Session, select the type of debug session required from the list and hit return.
Use VSCode to attach to the example and start a debug session
It is also possible to attach the debugger to a program that is already executing. Using the VSCode terminal panel in the debug view, type this in the command line prompt:
CUDBG_USE_LEGACY_DEBUGGER=1 ./<TheDPC++Executable> &
Using shift+ctrl+p, choose Debug: Start a new session, then select “C/C++: Intel icpx build attach cuda-gdb debug CUDA target”. VSCode will provide a list of currently running processes. Choose the process identified by the executable’s name.
VSCode supports the direct entry of cuda-gdb commands through its debug console. To enter cuda-gdb commands they must be preceded by the -exec prefix, as shown in figure 1. Upon the execution of a debugger command, the VSCode GUI and code panels should update accordingly.
Figure 1: Enter cuda-gdb commands directly in the VSCode debug console.
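For example, typing the following in the debug console runs the corresponding cuda-gdb command, which lists the kernels currently running on the device:
-exec info cuda kernels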
Debugging Kernel Code
It is possible to step into your kernel via cgh.parallel_for(), but you will need to step out a few times to reach the kernel code.
As you step into the kernel, or over the code around the parallel_for statement, the following cuda-gdb commands can be used to determine which kernel or thread (a kernel is a thread) and which line in that kernel is currently being debugged, or to find the point in the program when the kernel has just been enqueued.
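For example (a sketch; the exact output depends on the application):
info cuda kernels
info cuda threads
cuda kernel
info cuda kernels lists the kernels currently resident on the GPU, info cuda threads shows the device threads, and cuda kernel with no arguments reports the current kernel focus.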
When it has been determined that the host thread no longer has debugger focus and the kernel is instantiated as shown in figure 2, the debug focus can be moved to the kernel.
Figure 2: The DPC++ kernel has been enqueued but does not have debugger focus.
Use the cuda-gdb command to switch focus:
cuda kernel <id number of the kernel>
A star will appear next to the kernel listed indicating the kernel has debug focus.
On switching focus the debugger will stop at a breakpoint in one of the remaining executing kernels (another thread). It is possible the debugger will stop at another location within the SYCL library. In this case, use the finish command to move up and out of the stack frame and return the debugger to the kernel code of interest.
You may notice the debugger jump back to the host thread at the queue wait() point momentarily. Wait for it to return to the kernel code you are stepping through.
In the Intel oneapi-gdb debugger, you can use the gdb commands set scheduler-locking on or step to lock the debugger, preventing it from randomly jumping to other kernels still in execution. However, the cuda-gdb debugger lacks an equivalent command to prevent this behavior.
In these cases, it is possible to use the cuda kernel <id> command to switch back to the kernel of interest if the debugger switches to another kernel. When using the debugger and encountering multiple barriers from different kernels (sub-groups), the debugger can enter an unresponsive state. To recover, kill the debugger and start again.
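For reference, the scheduler locking mentioned above is a standard gdb setting and would be entered in oneapi-gdb as, for example:
set scheduler-locking step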
How to kill a process
If the debug session hangs or something goes wrong, the program’s process may still be running and must be stopped before starting a new debug session. Killing a process can also be performed from within the debugger itself using the command kill.
It is also possible to list the running processes by using the Linux command ps ux -u <your user name>. From the list, find the process identified by your program and, at the command line, type kill -9 <the process id number>.
Using more than one type of backend
If you are using a system where DPC++ has been set up to support multiple backends, e.g. OpenCL™ and CUDA®, it may be necessary to use a filter to direct the debug session to the appropriate backend. To do this, use the environment variable SYCL_DEVICE_FILTER when starting a debug session. From the terminal command line use a command similar to this:
SYCL_DEVICE_FILTER=cuda cuda-gdb ./myapp
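If you are unsure which backends and devices are available on the system, the sycl-ls tool shipped with the oneAPI toolkit can list them, for example:
sycl-ls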
ROCm™ Debugger Support
The ROCm SDK comes with a debugger, rocgdb, that can be used to debug kernels on AMD GPUs for HIP applications.
However, DPC++ is currently unable to generate appropriate debug information for SYCL kernels targeting AMD GPUs, so trying to debug a SYCL kernel using rocgdb will fail with an error such as:
Thread 5 "dbg" hit Breakpoint 1, with lanes [0-63], main::{lambda(sycl::_V1::handler&)#1}::operator()(sycl::_V1::handler&) const::{lambda(sycl::_V1::id<1>)#1}::operator()(sycl::_V1::id<1>) const (/long_pathname_so_that_rpms_can_package_the_debug_info/src/rocm-gdb/gdb/dwarf2/frame.c:1032: internal-error: Unknown CFA rule.
The debugger can still be helpful when the application is built without debug information. For example, if a kernel is throwing an error such as an invalid memory access, it’s possible to use rocgdb to run the program. It will then break on the error and the kernel assembly can be inspected at the point triggering the error using the disas command.
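As a rough sketch, such a session might look like the following (the executable name is a placeholder):
rocgdb ./myexecutable
run
disas
After run, the debugger stops when the kernel triggers the error, and disas shows the kernel assembly around the faulting instruction.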