This guide contains information on using DPC++ to run SYCL™ applications on AMD GPUs via the DPC++ HIP plugin version 2024.1.0.
Warning
Please keep in mind that this is beta-quality software, which means that it contains most of the major features, but is not yet complete and still contains known and unknown bugs. For more details on the supported features, see the Features documentation.
For general information about DPC++, refer to the DPC++ Resources section.
Supported Platforms
Warning
Support for ROCm 4.5.2 is deprecated and will be removed in a future release.
This release has been tested on the following platforms:
GPU Hardware | Architecture | Operating System | HIP | GPU Driver
---|---|---|---|---
AMD Radeon Pro W6800 | gfx1030 | Ubuntu 22.04.2 LTS | 5.4.3 | 6.1.0-1006-oem
AMD Radeon Pro W6800 | gfx1030 | Ubuntu 22.04.2 LTS | 4.5.2 | 6.1.0-1006-oem
In theory this release should work with all devices that are compatible with ROCm.
This release should work with HIP 4.5.x and HIP 5.x versions up to 5.4.x, but only HIP 5.4.3 and 4.5.2 have been tested. Codeplay cannot guarantee correct operation with other HIP releases.
HIP 5.4.3 can be installed alongside an existing HIP installation. See the Multi-version ROCm Installation section of the ROCm Installation Guide.
HIP 5.5 and above are not supported with DPC++ due to a change in the bitcode libraries distributed by HIP; this will be fixed in the next release.
This release should work across a wide array of AMD GPUs that are supported by ROCm, but Codeplay cannot guarantee correct operation on untested platforms. AMD GPUs not officially supported by ROCm may also work.
For a full list of AMD GPUs that are officially supported by ROCm on Linux, see here.
The package has been tested on Ubuntu 22.04 only, but it can be installed on other Linux systems.
The “oneAPI for AMD GPUs (beta)” packages currently only support Linux.
Prerequisites
Install C++ development tools.
You will need the following C++ development tools installed in order to build and run oneAPI applications: cmake, gcc, g++, make and pkg-config.
The following console commands will install the above tools on the most popular Linux distributions:
Ubuntu
sudo apt update
sudo apt -y install cmake pkg-config build-essential
Red Hat and Fedora
sudo yum update
sudo yum -y install cmake pkgconfig
sudo yum groupinstall "Development Tools"
SUSE
sudo zypper update
sudo zypper --non-interactive install cmake pkg-config
sudo zypper --non-interactive install -t pattern devel_C_C++
Verify that the tools are installed by running:
which cmake pkg-config make gcc g++
You should see output similar to:
/usr/bin/cmake
/usr/bin/pkg-config
/usr/bin/make
/usr/bin/gcc
/usr/bin/g++
Install an Intel® oneAPI Toolkit version 2024.1.0 that contains the DPC++/C++ Compiler.
For example, the “Intel oneAPI Base Toolkit” should suit most use cases.
The Toolkit must be version 2024.1.0; otherwise, oneAPI for AMD GPUs cannot be installed.
Install the GPU driver and ROCm™ software stack for the AMD GPU.
For example, for ROCm 5.4.3, follow the steps described in the Installation with install script guide.
Using the amdgpu-install installer is recommended, with the --usecase="dkms,graphics,opencl,hip,hiplibsdk" argument to ensure that all required components are installed.
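For example, the combined invocation might look like the following (a sketch; the exact installer package and setup steps for your ROCm version are described in the guide linked above):
amdgpu-install --usecase="dkms,graphics,opencl,hip,hiplibsdk"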
Installation
Download the latest oneAPI for AMD GPUs (beta) installer. The installer can also be downloaded with cURL or wget via the Download API (this requires an account).
Run the downloaded self-extracting installer:
sh oneapi-for-amd-gpus-2024.1.0-rocm-5.4.3-linux.sh
The installer will search for an existing Intel oneAPI Toolkit version 2024.1.0 installation in common locations. If you have installed an Intel oneAPI Toolkit in a custom location, use --install-dir /path/to/intel/oneapi.
If your Intel oneAPI Toolkit installation is outside your home directory, you may be required to run this command with elevated privileges, e.g. with sudo.
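For example, an installation into a custom toolkit location requiring elevated privileges might be invoked as:
sudo sh oneapi-for-amd-gpus-2024.1.0-rocm-5.4.3-linux.sh --install-dir /path/to/intel/oneapi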
Set Up Your Environment
To set up your oneAPI environment in your current session, source the Intel-provided setvars.sh script.
For system-wide installations:
. /opt/intel/oneapi/setvars.sh --include-intel-llvm
For private installations (in the default location):
. ~/intel/oneapi/setvars.sh --include-intel-llvm
The --include-intel-llvm option is required in order to add LLVM tools such as clang++ to the path.
Note that you will have to run this script in every new terminal session. For options to handle the setup automatically each session, see the relevant Intel oneAPI Toolkit documentation, such as Set Environment Variables for CLI Development.
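For example, one possible way to automate this for bash users is to append the command to ~/.bashrc (assuming a system-wide installation in the default location):
echo '. /opt/intel/oneapi/setvars.sh --include-intel-llvm' >> ~/.bashrc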
Ensure that the HIP libraries and tools can be found in your environment:
Run rocminfo. If it runs without any obvious errors in the output, then your environment should be set up correctly.
Otherwise, set your environment variables manually:
export PATH=/PATH_TO_ROCM_ROOT/bin:$PATH
export LD_LIBRARY_PATH=/PATH_TO_ROCM_ROOT/lib:$LD_LIBRARY_PATH
ROCm is commonly installed in /opt/rocm-x.x.x/.
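For example, assuming a default ROCm 5.4.3 installation, the exports might look like:
export PATH=/opt/rocm-5.4.3/bin:$PATH
export LD_LIBRARY_PATH=/opt/rocm-5.4.3/lib:$LD_LIBRARY_PATH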
Verify Your Installation
To verify the DPC++ HIP plugin installation, the DPC++ sycl-ls tool can be used to make sure that SYCL now exposes the available AMD GPUs. You should see something similar to the following in the sycl-ls output if AMD GPUs are found:
[ext_oneapi_hip:gpu:0] AMD HIP BACKEND, AMD Radeon PRO W6800 0.0 [HIP 40421.43]
If the available AMD GPUs are correctly listed, then the DPC++ HIP plugin was correctly installed and set up.
Otherwise, see the “Missing devices in sycl-ls output” section of the Troubleshooting documentation.
Note that this command may also list other devices, such as OpenCL™ devices, Intel GPUs, or NVIDIA® GPUs, depending on the available hardware and the DPC++ plugins installed.
Run a Sample Application
Create a file simple-sycl-app.cpp with the following C++/SYCL code:

#include <iostream>
#include <sycl/sycl.hpp>

int main() {
  // Creating buffer of 4 ints to be used inside the kernel code
  sycl::buffer<int, 1> Buffer{4};

  // Creating SYCL queue
  sycl::queue Queue{};

  // Size of index space for kernel
  sycl::range<1> NumOfWorkItems{Buffer.size()};

  // Submitting command group(work) to queue
  Queue.submit([&](sycl::handler &cgh) {
    // Getting write only access to the buffer on a device
    auto Accessor = Buffer.get_access<sycl::access::mode::write>(cgh);
    // Executing kernel
    cgh.parallel_for<class FillBuffer>(NumOfWorkItems, [=](sycl::id<1> WIid) {
      // Fill buffer with indexes
      Accessor[WIid] = static_cast<int>(WIid.get(0));
    });
  });

  // Getting read only access to the buffer on the host.
  // Implicit barrier waiting for queue to complete the work.
  auto HostAccessor = Buffer.get_host_access();

  // Check the results
  bool MismatchFound{false};
  for (size_t I{0}; I < Buffer.size(); ++I) {
    if (HostAccessor[I] != I) {
      std::cout << "The result is incorrect for element: " << I
                << " , expected: " << I << " , got: " << HostAccessor[I]
                << std::endl;
      MismatchFound = true;
    }
  }

  if (!MismatchFound) {
    std::cout << "The results are correct!" << std::endl;
  }

  return MismatchFound;
}
Compile the application with:
icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa \
     -Xsycl-target-backend --offload-arch=<ARCH> \
     -o simple-sycl-app simple-sycl-app.cpp
Where <ARCH> is the GPU architecture, e.g. gfx1030, which you can check by running:
rocminfo | grep 'Name: *gfx.*'
You should see the GPU architecture in the output, for example:
Name: gfx1030
Run the application with:
ONEAPI_DEVICE_SELECTOR="ext_oneapi_hip:*" SYCL_PI_TRACE=1 ./simple-sycl-app
You should see output like:
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_hip.so [ PluginVersion: 14.37.1 ]
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_unified_runtime.so [ PluginVersion: 14.37.1 ]
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::automatic
SYCL_PI_TRACE[all]: Selected device: -> final score = 1500
SYCL_PI_TRACE[all]:   platform: AMD HIP BACKEND
SYCL_PI_TRACE[all]:   device: AMD Radeon PRO W6800
The results are correct!
If so, you have successfully set up and verified your oneAPI for AMD GPUs development environment, and you can begin developing oneAPI applications.
The rest of this document provides general information on compiling and running oneAPI applications on AMD GPUs.
Use DPC++ to Target AMD GPUs
Compile for AMD GPUs
To compile a SYCL application for AMD GPUs, use the icpx compiler provided with DPC++. For example:
icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa \
-Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030 \
-o sycl-app sycl-app.cpp
The following flags are required:

-fsycl: Instructs the compiler to build the C++ source file in SYCL mode. This flag will also implicitly enable C++17 and automatically link against the SYCL runtime library.
-fsycl-targets=amdgcn-amd-amdhsa: Instructs the compiler to build SYCL kernels for the AMD GPU target.
-Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030: Instructs the compiler to build SYCL kernels for the gfx1030 AMD GPU architecture.

Note that when targeting an AMD GPU, the specific architecture of the GPU must be provided.
For more information on available SYCL compilation flags, see the DPC++ Compiler User’s Manual. For information on all DPC++ compiler options, see the Compiler Options section of the Intel oneAPI DPC++/C++ Compiler Developer Guide and Reference.
Using the icpx compiler
The icpx compiler is by default a lot more aggressive with optimizations than the regular clang++ driver, as it uses both -O2 and -ffast-math. In many cases this can lead to better performance, but it can also lead to issues for certain applications. In such cases, it is possible to disable -ffast-math by using -fno-fast-math and to change the optimization level by passing a different -O flag. It is also possible to use the clang++ driver directly, which can be found in $releasedir/compiler/latest/linux/bin-llvm/clang++, to get the regular clang++ behavior.
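For example, a sketch of an invocation that lowers the optimization level and disables fast-math might look like:
icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa \
     -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030 \
     -O1 -fno-fast-math \
     -o sycl-app sycl-app.cpp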
Compile for Multiple Targets
In addition to targeting AMD GPUs, you can build SYCL applications that can be compiled once and then run on a range of hardware. The following example shows how to output a single binary including device code that can run on AMD GPUs, NVIDIA GPUs, or any device that supports SPIR, e.g. Intel GPUs.
icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa,nvptx64-nvidia-cuda,spir64 \
-Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030 \
-Xsycl-target-backend=nvptx64-nvidia-cuda --offload-arch=sm_80 \
-o sycl-app sycl-app.cpp
Run SYCL Applications on AMD GPUs
After compiling your SYCL application for an AMD target, you should also ensure that the correct SYCL device representing the AMD GPU is selected at runtime.
In general, simply using the default device selector should select one of the available AMD GPUs. However, in some scenarios users may want to change their SYCL application to use a more precise SYCL device selector, such as the GPU selector, or even a custom selector, as sketched below.
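As a minimal sketch, assuming a SYCL 2020 implementation such as DPC++, a GPU selector and a custom selector restricted to the HIP backend might look like the following (hip_gpu_selector is an illustrative name, not part of the SYCL API):

#include <iostream>
#include <sycl/sycl.hpp>

// Illustrative custom selector: a positive score marks a device as
// eligible, a negative score rejects it outright.
int hip_gpu_selector(const sycl::device &dev) {
  if (dev.is_gpu() && dev.get_backend() == sycl::backend::ext_oneapi_hip)
    return 1;
  return -1;
}

int main() {
  sycl::queue GpuQueue{sycl::gpu_selector_v}; // any available GPU
  sycl::queue HipQueue{hip_gpu_selector};     // only HIP backend GPUs
  std::cout << HipQueue.get_device().get_info<sycl::info::device::name>()
            << std::endl;
  // Note: constructing HipQueue throws if no HIP GPU is available.
}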
The environment variable ONEAPI_DEVICE_SELECTOR may be used to help the SYCL device selectors by restricting the set of devices that can be used. For example, to only allow devices exposed by the DPC++ HIP plugin:
export ONEAPI_DEVICE_SELECTOR="ext_oneapi_hip:*"
For more details on this environment variable, see the Environment Variables section of the oneAPI DPC++ Compiler documentation. Note: this environment variable will be deprecated in a subsequent release.
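The variable can also be set for a single run, or used to pick a specific device by index; for example, assuming the first HIP device should be used:
ONEAPI_DEVICE_SELECTOR="ext_oneapi_hip:0" ./sycl-app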