SYCL on R-Car

Introduction

This document provides a guide on how to build and run a SYCL application on the IMP-X5 and IMP-X5+ processors using ComputeCpp.

More documentation on how to develop using SYCL is available on our developer website.

Some good places to start are:

  • What is SYCL?
  • SYCL coding fundamentals
  • Anatomy of a SYCL application

Supported Hardware and Software

This release has been tested on the following hardware using Yocto 2.23.1 and v2.1.3 of the Poky toolchain.

  • H3 StarterKit - HW version 1.1, 2.0
  • V3M Eagle - HW version 2.0
  • V3H Condor - HW version 1.0

Setting up the Board

It is presumed that the device is set up with Yocto as outlined in the hardware setup guide.

In addition, you should copy the contents of the release package from this website onto your hardware:

  • ComputeCpp folder - the same binaries can be used for all three hardware variations
  • H3/V3M/V3H folder - there are separate OpenCL binaries for each hardware variation; copy the binaries for your hardware

You will also need to copy the OpenCL headers onto your device. Get the OpenCL headers from the Khronos GitHub repository and put them on the device. You will need to specify their location during the compilation process outlined below.

The LD_LIBRARY_PATH environment variable must be set on the hardware so that the dynamic loader knows where to look for the OpenCL and SYCL binaries when executing applications.

export LD_LIBRARY_PATH=/path/to/computecpp/lib:/path/to/computeaorta/lib
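
If the export above is placed in a login script, it can be useful to guard against typos in the paths. The following is a minimal sketch, not part of the release package: add_lib_dir is a hypothetical helper name, and the two paths are the same placeholders used above. It prepends a directory to LD_LIBRARY_PATH only when the directory exists and is not already listed.

```shell
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH once,
# and only if the directory actually exists.
add_lib_dir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "warning: $dir does not exist" >&2
    return 1
  fi
  case ":${LD_LIBRARY_PATH:-}:" in
    *":$dir:"*) ;;  # already present, nothing to do
    *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
  esac
  export LD_LIBRARY_PATH
}

# Placeholder paths from the text above; replace them with your own.
add_lib_dir /path/to/computecpp/lib || true
add_lib_dir /path/to/computeaorta/lib || true
```

A warning on stderr means the path needs correcting before any SYCL application will find its libraries.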

Validate your setup

The "computecpp_info" tool will output any hardware that can be used by ComputeCpp, and in this case it will confirm if the R-Car hardware is correctly set up.

In the "ComputeCpp" folder on your device execute "computecpp_info" which can be found in the "bin" folder.

export LD_LIBRARY_PATH=/path/to/computecpp/lib:/path/to/computeaorta/lib
./ComputeCpp/bin/computecpp_info

The output should look like this:

********************************************************************************
ComputeCpp Info (CE 1.0.1)
********************************************************************************
Toolchain information:

GLIBC version: 2.23
GLIBCXX: 20160609
This version of libstdc++ is supported.
********************************************************************************
[rcv_sample_get_localdata] rcv_thread_key is not initialized
Device Info:
Discovered 1 devices matching:
  platform    : <any>
  device type : <any>
--------------------------------------------------------------------------------
Device 0:
  Device is supported                     : UNTESTED - Untested OS
  CL_DEVICE_NAME                          : Codeplay Software Ltd. - Renesas CV Engine
  CL_DEVICE_VENDOR                        : Codeplay Software Ltd.
  CL_DRIVER_VERSION                       : 1.22
  CL_DEVICE_TYPE                          : CL_DEVICE_TYPE_ACCELERATOR
If you encounter problems when using any of these OpenCL devices, please consult
this website for known issues:
https://computecpp.codeplay.com/releases/v0.9.0/platform-support-notes
********************************************************************************

If the output does not show the "Renesas CV Engine" device, there is most likely a problem with your OS build, so revisit the hardware setup guide.
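
The same check can be scripted. This is a sketch under the assumption that the device name string printed by computecpp_info matches the sample output above; has_cv_engine is a hypothetical helper name, and the relative path to the tool is the one used earlier in this section.

```shell
# Hypothetical helper: succeeds if the text on stdin mentions the R-Car device.
has_cv_engine() {
  grep -q "Renesas CV Engine"
}

# Only attempt the check if the tool is present at the path used above.
if [ -x ./ComputeCpp/bin/computecpp_info ]; then
  if ./ComputeCpp/bin/computecpp_info 2>&1 | has_cv_engine; then
    echo "R-Car CV Engine detected"
  else
    echo "R-Car CV Engine not found; re-check the OS build" >&2
  fi
fi
```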

Download the ComputeCpp SDK

Download the ComputeCpp SDK to get the SYCL samples. The SDK can be found on GitHub at: https://github.com/codeplaysoftware/computecpp-sdk

Clone the SDK Repo and create a build folder using the following steps.

Note:

  • This version of the R-Car release works with the v1.0.1 tagged codebase of the SDK.
  • The "scan" and "reduction" samples currently fail to execute on V3M & V3H hardware.
  • The "image" and "gaussian" samples will not run due to the lack of OpenCL image API support.

git clone https://github.com/codeplaysoftware/computecpp-sdk.git
cd computecpp-sdk
git checkout tags/v1.0.1
mkdir build
cd build

Compiling SDK Natively

It is possible to compile the SDK natively on the hardware itself. Run the following cmake command on the device, replacing the arguments with your own paths.


cmake -DComputeCpp_DIR=/path/to/computecpp/root/dir \
  -DOpenCL_INCLUDE_DIR=/path/to/opencl/include/directory \
  -DCOMPUTECPP_USER_FLAGS="-no-serial-memop;-target;aarch64-poky-linux;-DLOCAL_SIZE_M=1;-DLOCAL_SIZE_N=32" \
  -DOpenCL_LIBRARY=/path/to/libOpenCL.so.1.2/file \
  -DCMAKE_CXX_FLAGS="-DLOCAL_SIZE_M=1 -DLOCAL_SIZE_N=32" \
  -DCOMPUTECPP_BITCODE=spir ..

Now build the "hello-world" sample using the command:

make hello-world

The "hello-world" sample can be executed using the command:

./samples/hello-world/hello-world

You can build all the samples using the following command:

make

Cross-compiling the SDK

It is possible to compile the ComputeCpp SDK through cross-compilation. For example, you can compile code on an Ubuntu x86 machine and deploy the binaries to the target device.

Cross-compilation is done using the poky cross-compilation toolchain. To acquire this toolchain, the Yocto SDK must be built; the hardware setup guide links to instructions on how to do this.

The toolchain install script generated should be called poky-glibc-x86_64-core-image-minimal-aarch64-toolchain-2.1.3.sh and can be found in the "Yocto/build/build/tmp/deploy/sdk" directory of the built Yocto SDK.

Run this script to install the poky 2.1.3 cross compilation toolchain, called toolchain.cmake.

./poky-glibc-x86_64-core-image-minimal-aarch64-toolchain-2.1.3.sh

Once the poky toolchain is installed, and before using the cmake command below, set the following environment variable so that it points to the directory containing the "x86_64-pokysdk-linux" and "aarch64-poky-linux" folders generated by the poky installation. By default this is the following folder.

export SDK_POKY_ROOT=/opt/poky/2.1.3/sysroots/
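
Before running cmake, it may help to confirm that the variable points at a directory with the expected layout. The following is a sketch only: check_poky_root is a hypothetical helper name, and the two folder names are the ones mentioned in the paragraph above.

```shell
# Hypothetical helper: succeed only if both sysroot folders are present
# under the given directory (a trailing slash is tolerated).
check_poky_root() {
  root="${1%/}"
  for sub in x86_64-pokysdk-linux aarch64-poky-linux; do
    if [ ! -d "$root/$sub" ]; then
      echo "missing: $root/$sub" >&2
      return 1
    fi
  done
}

if check_poky_root "${SDK_POKY_ROOT:-}"; then
  echo "SDK_POKY_ROOT looks usable: ${SDK_POKY_ROOT:-}"
fi
```

A "missing" message suggests the toolchain was installed to a different location than the default, so SDK_POKY_ROOT should be adjusted to match.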

When cross-compiling you need to specify a series of extra cmake parameters. The variables you may have to set are shown in the following example command. Underneath the example command are explanations of what should be passed to each argument.

cmake -DComputeCpp_DIR=/path/to/computecpp/folder \
  -DOpenCL_LIBRARY=/path/to/libOpenCL.so.1.2/file \
  -DOpenCL_INCLUDE_DIR=/path/to/OpenCL/header/directory \
  -DCOMPUTECPP_RUNTIME_LIBRARY=/path/to/libComputeCpp.so/file \
  -DCOMPUTECPP_RUNTIME_LIBRARY_DEBUG=/path/to/libComputeCpp.so/file \
  -DCMAKE_TOOLCHAIN_FILE=../computecpp-sdk/cmake/toolchains/arm-gcc.cmake \
  -DCOMPUTECPP_USER_FLAGS="-DLOCAL_SIZE_M=1 -DLOCAL_SIZE_N=32" \
  -DCMAKE_CXX_FLAGS="-DLOCAL_SIZE_M=1 -DLOCAL_SIZE_N=32" \
  -DCOMPUTECPP_BITCODE=spir ..

  • CMAKE_BUILD_TYPE : Either Release or Debug.
  • ComputeCpp_DIR : The root directory of the ComputeCpp package. This should be the one native to the platform you are building on.
  • OpenCL_LIBRARY : The ComputeAorta libOpenCL.so.
  • OpenCL_INCLUDE_DIR : The location of the OpenCL headers.
  • COMPUTECPP_RUNTIME_LIBRARY : The libComputeCpp.so, which is located in the lib directory of the ComputeCpp package. This should be from the ARM package.
  • CMAKE_TOOLCHAIN_FILE : The ComputeCpp samples project includes a cmake file used for cross-compilation. It sets the CMake compiler variables used by the project, including setting it to use the installed poky toolchain.
  • COMPUTECPP_BITCODE : The R-Car devices are 32-bit, so you need to add this flag to the cmake command to output 32-bit SPIR. If this value is not specified, the default is the SPIR type matching the host system, which is usually spir64.
  • COMPUTECPP_USER_FLAGS : Specifies parameters required for the "tiled convolution" sample to execute optimally.
  • CMAKE_CXX_FLAGS : Specifies parameters required for the "tiled convolution" sample to execute optimally.
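
Because a missing COMPUTECPP_BITCODE silently falls back to the host's SPIR type, it can be worth confirming what cmake actually recorded. The sketch below reads the value back from CMakeCache.txt, which stores entries as NAME:TYPE=VALUE. cmake_cache_value is a hypothetical helper name, and the check assumes it is run from the build folder.

```shell
# Hypothetical helper: print the cached value of a CMake variable.
# $1 is the variable name, $2 an optional path to the cache file.
cmake_cache_value() {
  sed -n "s/^$1:[^=]*=//p" "${2:-CMakeCache.txt}"
}

# Warn if the build folder was configured with something other than 32-bit SPIR.
if [ -f CMakeCache.txt ] && [ "$(cmake_cache_value COMPUTECPP_BITCODE)" != "spir" ]; then
  echo "warning: COMPUTECPP_BITCODE is not set to spir" >&2
fi
```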

Now it is possible to build the "hello-world" sample using the command:

make hello-world

Once copied to the device, the "hello-world" sample can be executed using the command:

./samples/hello-world/hello-world

You can build all the samples using the following command:

make

Running the Samples on Device

After you have built the samples, copy the binary files onto the device and execute them there.

Optional Extras

For extra debugging with ComputeCpp you can set COMPUTECPP_CONFIGURATION_FILE. This allows you to set variables that ComputeCpp will pick up. Adding verbose_output=true to the file makes the tests output which device they are running against, which is useful for making it clear whether you are running on ARM or CVE. To use the file, create a plain text file and point to it using the COMPUTECPP_CONFIGURATION_FILE environment variable.

export COMPUTECPP_CONFIGURATION_FILE=/path/to/config-file
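
Putting these steps together, a minimal sketch might look like the following. The file location and name (computecpp.conf in the temporary directory) are assumptions chosen for illustration; only the verbose_output=true setting comes from the description above.

```shell
# Hypothetical location for the configuration file; any readable path works.
config_file="${TMPDIR:-/tmp}/computecpp.conf"

# Write the single setting described above into a plain text file.
printf 'verbose_output=true\n' > "$config_file"

# Point ComputeCpp at the file for this shell session.
export COMPUTECPP_CONFIGURATION_FILE="$config_file"
```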