ComputeCpp™ Community Edition

                                                                              Please note that you are viewing a guide targeting an older version of ComputeCpp™ Community Edition. This guide was designed for version 2.3.0.

                                                                              Using the Tracy Profiler with ComputeCpp

                                                                              When developing performance-sensitive applications, it is important to understand which parts of the code are critical to performance. Good profiling support is paramount for any application that aims to solve a problem more efficiently in a constrained environment. Efficiency is context-dependent: it could mean lowering the power consumption of a battery-powered embedded device or getting peak performance from the hardware in a supercomputer.

                                                                              In the context of SYCL applications, many things can affect performance. How well is the application written? How well does the compiler understand your code? Are you using the right compiler flags? Could you be doing more work in parallel? Why is this kernel taking so much time to execute? What is your application doing while this kernel is running?

                                                                              These are some of the questions that you might want to answer when developing your SYCL application. To help you answer these questions, we are adding native support for the Tracy Profiler to ComputeCpp Professional Edition.

                                                                              Figure (Tracy-UI): a screenshot of Tracy showing details of a profiling session for the NBody demo available in the ComputeCpp SDK.

                                                                              Tracy Profiler

                                                                              Tracy is a real-time, nanosecond-resolution, remote-telemetry, hybrid frame and sampling profiler for games and other applications. It is an open-source profiler that supports CPU (C, C++, Lua), GPU (OpenGL, Vulkan, OpenCL, Direct3D 12), memory allocations, locks, context switches and more.

                                                                              By adding native support for the Tracy profiler in ComputeCpp, you can connect Tracy to your application by simply enabling a configuration option. When connected, your application will immediately start sending data to Tracy, forming a nanosecond resolution execution profile that can be analyzed, searched and inspected.

                                                                              Tracy can handle large amounts of data; its only requirement is that enough RAM is available on the machine running the server. Because it is designed as a client-server application, Tracy can also analyze remote applications, which makes it suitable for use with embedded devices and development boards.
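
When profiling a remote board, it can help to confirm that the machine running the Tracy server can reach the profiled application over the network before starting a session. The sketch below is one way to do that in Python. It assumes the application listens on TCP port 8086, which is the default in upstream Tracy; the port used by the ComputeCpp integration may differ. The names `can_reach` and `TRACY_DEFAULT_PORT` are hypothetical helpers, not part of ComputeCpp or Tracy.

```python
import socket

# Assumption: 8086 is the upstream Tracy default port. The ComputeCpp
# integration may use a different port; adjust as needed.
TRACY_DEFAULT_PORT = 8086


def can_reach(host, port=TRACY_DEFAULT_PORT, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling `can_reach("192.168.0.42")` with the address of your device (the address here is only a placeholder) returns `False` when a firewall or missing route blocks the connection, in which case the JSON profiler may be the more practical option.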

                                                                              Tracy-Diagram

                                                                              Enabling Tracy Profiler in ComputeCpp

                                                                              In scenarios where remote profiling is not an option, due to network restrictions or lack of connectivity, it is still possible to use the ComputeCpp JSON profiler. After running the application, you can load the resulting JSON file in Google Chrome, the new Microsoft Edge browser, or even Tracy itself, which supports importing files in JSON format.
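
If you prefer a quick text summary over a graphical viewer, the JSON file can also be inspected with a short script. The sketch below assumes the file follows the Chrome Trace Event Format (suggested by the fact that Chrome can load it) and that kernels and other activities are recorded as complete events (`"ph": "X"`) with a `dur` field in microseconds. Both points are assumptions about the output, not documented behavior, and `load_events` and `summarize` are hypothetical helper names.

```python
import json
import sys
from collections import defaultdict


def load_events(path):
    """Return the list of trace events stored in a JSON trace file."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    # The Trace Event Format allows either a bare array of events or an
    # object that stores the array under the "traceEvents" key.
    if isinstance(data, dict):
        return data.get("traceEvents", [])
    return data


def summarize(events):
    """Return the event count and total duration (microseconds) per name.

    Only complete events (ph == "X") are counted; other phases are skipped.
    """
    totals = defaultdict(lambda: {"count": 0, "total_us": 0})
    for event in events:
        if not isinstance(event, dict) or event.get("ph") != "X":
            continue
        entry = totals[event.get("name", "<unnamed>")]
        entry["count"] += 1
        entry["total_us"] += event.get("dur", 0)
    return dict(totals)


if __name__ == "__main__" and len(sys.argv) > 1:
    summary = summarize(load_events(sys.argv[1]))
    # Print the most expensive event names first.
    for name, stats in sorted(summary.items(),
                              key=lambda item: item[1]["total_us"],
                              reverse=True):
        print(f"{name}: {stats['count']} events, {stats['total_us']} us")
```

Run it with the path to the JSON file as the only argument. If the file uses begin/end event pairs instead of complete events, the summary will be empty and the script would need to pair those events first.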

                                                                              The profilers in ComputeCpp are not mutually exclusive: you can have both a real-time capture with Tracy and a JSON file at the end of your application's execution.

                                                                              To enable Tracy, add enable_tracy_profiling = true to your configuration file. Note that profiling is disabled by default, so you may need to add enable_profiling = true to the configuration file as well. When profiling support is enabled, ComputeCpp automatically activates the JSON profiler; to turn it off, use enable_json_profiling = false.

                                                                              Here is an example of a configuration file that will enable Tracy and disable the JSON output:

                                                                              enable_profiling = true
                                                                              enable_tracy_profiling = true
                                                                              enable_json_profiling = false
                                                                              

                                                                              The ComputeCpp integration with Tracy also supports the display of performance counter data. To enable performance counters, add enable_perf_counters_profiling = true to your configuration file.
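
When switching between profiling setups often, it can be convenient to generate the configuration file from a script. The sketch below renders the four options mentioned in this guide in the `key = value` form used above. The option names come from this guide; `render_config`, `write_config` and `TRACY_WITH_COUNTERS` are hypothetical names, and where ComputeCpp looks for the file is described in the Configuration file guide rather than here.

```python
def render_config(options):
    """Render boolean options as 'key = true' or 'key = false' lines."""
    return "".join(
        f"{key} = {'true' if value else 'false'}\n"
        for key, value in options.items()
    )


def write_config(path, options):
    """Write the rendered options to the given path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(render_config(options))


# Tracy capture with performance counters and no JSON output.
TRACY_WITH_COUNTERS = {
    "enable_profiling": True,
    "enable_tracy_profiling": True,
    "enable_json_profiling": False,
    "enable_perf_counters_profiling": True,
}
```

Calling `write_config` with a path of your choice and `TRACY_WITH_COUNTERS` produces a file with one option per line, in the order shown in the dictionary.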

                                                                              Profiling your application with Tracy

                                                                              When you download ComputeCpp Professional Edition, you will now find a binary called Tracy.exe on Windows or Tracy on Linux. This binary is the Tracy server, and it is guaranteed to be compatible with the ComputeCpp version in the same package. Tracy uses a custom communication protocol, so the protocol version used in ComputeCpp must match the protocol version in the server application. For this reason, you must use the Tracy server included in the ComputeCpp PE release package to avoid compatibility issues.

                                                                              Example of a Profiling Session with Tracy

                                                                              To demonstrate the capabilities of the Tracy integration with ComputeCpp, we selected the NBody simulation from the ComputeCpp SDK (see the screenshot below). This application is a good example because it launches many kernels every second and doesn't finish until it is interrupted.