Profiling ComputeCpp Applications

Objective of this document

ComputeCpp Professional Edition has built-in support for writing a JSON file that contains profiling information from the runtime and the underlying OpenCL implementation. This JSON file conforms to the Chrome Trace Event format and can be loaded in any Chrome installation by opening the URL chrome://tracing.

The objective of this document is to describe how to activate profiling for a ComputeCpp application using the configuration file.

Enabling the JSON profiler backend

ComputeCpp supports a configuration file: a file containing a list of options that control how the runtime performs certain operations.

To enable profiling of the runtime, create a file containing the option enable_profiling = true, point the environment variable COMPUTECPP_CONFIGURATION_FILE to this file, and run the application. Each of the following is a single command line that creates a file called sycl_config.txt and sets the environment variable to enable profiling:

  • Windows (Command Prompt):
(echo enable_profiling = true) > sycl_config.txt & set COMPUTECPP_CONFIGURATION_FILE=sycl_config.txt
  • Linux:
echo "enable_profiling = true" > sycl_config.txt; export COMPUTECPP_CONFIGURATION_FILE=sycl_config.txt

Note that if COMPUTECPP_CONFIGURATION_FILE points to a file that does not exist, the runtime will fail to initialize.
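Because a missing configuration file stops the runtime from initializing, it can help to wrap the setup in a small function that only exports the variable once the file is known to exist. The following is a minimal sketch for a POSIX shell on Linux. The function name enable_computecpp_profiling and the binary name ./my_sycl_app are hypothetical and not part of ComputeCpp.

```shell
# Hypothetical helper (not part of ComputeCpp): writes the configuration
# file and exports COMPUTECPP_CONFIGURATION_FILE only if the file exists.
enable_computecpp_profiling() {
    config_path="${1:-sycl_config.txt}"

    # Write the single option that enables profiling.
    printf 'enable_profiling = true\n' > "$config_path" || return 1

    # Guard against pointing the runtime at a missing file.
    if [ ! -f "$config_path" ]; then
        echo "configuration file not found: $config_path" >&2
        return 1
    fi

    export COMPUTECPP_CONFIGURATION_FILE="$config_path"
}

# Usage (./my_sycl_app is a placeholder for your own binary):
#   enable_computecpp_profiling sycl_config.txt && ./my_sycl_app
```

A relative path such as sycl_config.txt is resolved against the working directory of the application, so the application should be started from the directory where the file was created, or an absolute path should be passed instead.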

Profiling output

By default, when the application finishes, the runtime writes the JSON file to the current working directory (often, but not necessarily, the directory that contains the application binary), using a name of the form [executable_name]_[current_date].json.

This behaviour can be changed by setting the environment variable COMPUTECPP_PROFILING_OUTPUT. If it is set, the runtime uses the value of this variable as the path of the output file. The file does not need to exist, but the application must have permission to create it and write to it.
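The permission requirement can be checked before the application starts, so that a profiling run is not wasted on an unwritable location. The following is a minimal sketch for a POSIX shell on Linux. The function name set_profiling_output and the path in the usage comment are hypothetical; only the variable COMPUTECPP_PROFILING_OUTPUT comes from the text above.

```shell
# Hypothetical helper (not part of ComputeCpp): exports
# COMPUTECPP_PROFILING_OUTPUT after checking that the file can be written.
set_profiling_output() {
    output_path="$1"
    output_dir=$(dirname "$output_path")

    # The file itself may be absent, but its directory must be writable.
    if [ ! -d "$output_dir" ] || [ ! -w "$output_dir" ]; then
        echo "cannot create files in: $output_dir" >&2
        return 1
    fi

    # If the file already exists, it must be writable as well.
    if [ -e "$output_path" ] && [ ! -w "$output_path" ]; then
        echo "cannot write to: $output_path" >&2
        return 1
    fi

    export COMPUTECPP_PROFILING_OUTPUT="$output_path"
}

# Usage (the path is a placeholder):
#   set_profiling_output "$HOME/traces/run1.json" && ./my_sycl_app
```

The check is only advisory: permissions can change between the check and the moment the runtime writes the file at application exit.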

Performance Counters for Intel GPUs

When running ComputeCpp applications on Intel GPUs that match one of the following architectures, the profiler can display performance counters for the ComputeBasic metric set:

  • Intel(R) Processors with Gen11 graphics devices (formerly Ice Lake),
  • Intel(R) Processors with Gen9 graphics devices (formerly Skylake, Kaby Lake, Apollo Lake/Broxton, Gemini Lake, Coffee Lake),
  • Intel(R) Processors with Gen8 graphics devices (formerly Broadwell),
  • Intel(R) Processors with Gen7.5 graphics devices (formerly Haswell).

No extra configuration is required, as long as the system supports Intel's Metrics Discovery API.

Example

The following is an example of the Google Chrome visualizer displaying an execution of BabelStream on an Intel GPU.

Note the performance counters at the bottom. Many more counters are available and can be reached by scrolling the window.

[Figure: Chrome JSON profiler Visualizer]
