There is no kernel debugging support for OpenCL applications on CVengine in the current version of ComputeAorta for R-Car. Debugging support for OpenCL kernels will be added in future releases.
We distinguish three mechanisms to profile OpenCL applications:
- Manual profiling using event profiling information on profiling queues
- Using CodeXL OpenCL profiling
- Using OpenCL trace intercept tools
Any of the above can be used effectively on the CVengine implementation. We focus on the first two. At the time of writing, no OpenCL trace intercept tool built for an ARM toolchain was available. However, once one becomes available, it should work without modification.
OpenCL event objects can be used to obtain profiling information that measures the execution time of a command. To capture profiling information for OpenCL commands associated with events, create the command queue with the CL_QUEUE_PROFILING_ENABLE flag set in the properties argument to clCreateCommandQueue.
The profiling operations are available for memory objects and kernels. Details of the OpenCL commands that return events with profiling information can be found in Section 5.12 of the OpenCL 1.2 specification.
The example below illustrates the usage of OpenCL profiling events to extract the execution time of a kernel enqueued to a device.
// Declaration of an event to obtain performance information
cl_event perf_event;
// Storage for the start and end timestamps
cl_ulong start = 0, end = 0;
// Create an OpenCL queue with profiling enabled.
// clCreateCommandQueue takes (context, device, properties, errcode_ret);
// device is the cl_device_id of the CVengine, obtained beforehand.
cl_command_queue cvEngineQueue =
    clCreateCommandQueue(context, device, CL_QUEUE_PROFILING_ENABLE, NULL);
// Enqueue a kernel as usual. The execution of a kernel can return a perf_event
clEnqueueNDRangeKernel(cvEngineQueue,…, &perf_event);
// Wait for the kernel execution event
clWaitForEvents(1, &perf_event);
// Obtain the timestamps of the start and end stages of the event
clGetEventProfilingInfo(perf_event, CL_PROFILING_COMMAND_START,
                        sizeof(cl_ulong), &start, NULL);
clGetEventProfilingInfo(perf_event, CL_PROFILING_COMMAND_END,
                        sizeof(cl_ulong), &end, NULL);
// start and end are device-specific timestamps. Subtracting the two gives the
// execution time from start to end. Resolution is nanoseconds, so we multiply
// by 1e-06 to obtain milliseconds.
cl_double kernelExecutionTime = (cl_double)(end - start) * (cl_double)(1e-06);
The clGetEventProfilingInfo function is used to obtain the timestamps for the event. Timestamps for the time the command was enqueued (CL_PROFILING_COMMAND_QUEUED) and for command submission (CL_PROFILING_COMMAND_SUBMIT) are also available.
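When all four timestamps are collected, the lifetime of a command can be split into queueing delay, submission delay, and execution time. The following is a minimal sketch of that arithmetic only, not part of ComputeAorta or the OpenCL API. The struct and function names are hypothetical, and uint64_t stands in for cl_ulong so that the snippet compiles without the OpenCL headers. It assumes the timestamps are in nanoseconds, as stated above.

```c
#include <stdint.h>

/* Hypothetical container for the four values returned by
   clGetEventProfilingInfo; uint64_t stands in for cl_ulong. */
typedef struct {
    uint64_t queued;  /* CL_PROFILING_COMMAND_QUEUED */
    uint64_t submit;  /* CL_PROFILING_COMMAND_SUBMIT */
    uint64_t start;   /* CL_PROFILING_COMMAND_START  */
    uint64_t end;     /* CL_PROFILING_COMMAND_END    */
} event_timestamps;

/* Convert the interval between two nanosecond timestamps to milliseconds.
   Returns 0.0 if the timestamps are out of order. */
double interval_ms(uint64_t from, uint64_t to)
{
    if (to < from) {
        return 0.0;
    }
    return (double)(to - from) * 1e-06;
}

/* Time the command spent in the host queue before submission. */
double queue_delay_ms(const event_timestamps *t)
{
    return interval_ms(t->queued, t->submit);
}

/* Time between submission to the device and the start of execution. */
double submit_delay_ms(const event_timestamps *t)
{
    return interval_ms(t->submit, t->start);
}

/* Time the command spent executing on the device. */
double execution_ms(const event_timestamps *t)
{
    return interval_ms(t->start, t->end);
}
```

In an application, each field would be filled by one clGetEventProfilingInfo call on the same event after clWaitForEvents has returned. A large queueing or submission delay relative to the execution time suggests that the overhead lies on the host side rather than in the kernel itself.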
Codeplay, in the context of the LPGPU2 project, has extended the AMD CodeXL profiler to support non-AMD devices. Using the remote profiling capabilities of CodeXL, it is possible to execute and retrieve the execution trace of an application and visualize it on the graphical tool. To launch the tracing, first execute the remote agent on the board:
and then execute CodeXL on your desktop PC.
From the CodeXL GUI toolbar menu:
Select File/New Project (ctrl + N).
In the new window choose the Remote Host radio button.
Set the host name or the IP address as Remote Host Address.
In this screen it is also possible to set the execution path to the location on the board where the OpenCL application binary is available.
Selecting Profile and then Application Time-trace enables the OpenCL tracing view.
Hitting the green arrow (Play) will trigger the execution of the binary on the board.
The remote agent will collect all the information and send it to the desktop client, which will then visualize it on the display. Details on how to use CodeXL or on the CodeXL port from Codeplay can be found in the LPGPU2 blog post announcement. At the time of writing, the profiler can only be obtained directly from Codeplay support.
The image below shows an example OpenCL profiling session using CodeXL to profile an OpenCL application on the R-Car V3M CVengine. The coloured boxes represent the different OpenCL operations and their durations. The detailed list of OpenCL operations can be seen in the box with the list. Instructions to use CodeXL are available on the CodeXL website.