This guide starts with an introduction to the SYCL programming model and to GPU performance in general. It then introduces the basics of performance analysis on GPUs and the common types of tools used for it. Finally, it gives some information on vendor-specific GPUs and the available tooling. We also recommend reading the free Data Parallel C++ book; Chapter 15 is dedicated to GPU performance in the context of SYCL and DPC++.
For a list of common SYCL optimizations for both Nvidia and AMD GPUs, refer to:
For specific information when targeting Nvidia GPUs, refer to:
If you are also interested in performance optimization specific to Intel GPUs, refer to the corresponding Intel GPU-specific performance guide.