OpenCL on R-Car

Introduction

This document explains how to run an OpenCL application on the IMP CVEngine processor using ComputeAorta.

Supported Hardware and Software

This release has been tested on the following hardware using Yocto 2.23.1 and v2.1.3 of the Poky toolchain.

  • H3 StarterKit - HW version 1.1, 2.0
  • V3M Eagle - HW version 2.0
  • V3H Condor - HW version 1.0

Image support

There is no support for the OpenCL "Image" APIs on the H3, V3M and V3H hardware. This means that image API calls like clCreateImage() will return CL_INVALID_OPERATION as defined by the standard.

Board setup

To run OpenCL applications with ComputeAorta on R-Car H3 hardware, the following components need to be on the device.

  • The ComputeAorta archive, available from our downloads section, which contains the libOpenCL.so binary and the OpenCL test application clVectorAddition used to validate the setup in the following sections
  • The Renesas drivers for the CVEngine must be included in the build of Yocto

Follow the hardware setup guide for links explaining how to build the Yocto OS for your hardware and how to obtain the toolchain for cross-compilation.

OpenCL binary

ComputeAorta OpenCL is shipped as a dynamically loaded ELF shared library, libOpenCL.so. This means that it is loaded by an application at runtime, rather than being statically linked into the binary. The runtime linker will try to load libOpenCL.so from the standard locations on the device (/usr/lib and /lib). If libOpenCL.so is placed in a custom directory it won't be found there, in which case setting LD_LIBRARY_PATH to the library's location at runtime allows the linker to pick it up. Alternatively, building your application with the -rpath linker option (passed through gcc as -Wl,-rpath,<dir>) will hardcode a default runtime library path into your executable.
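
As a rough diagnostic for the search behaviour described above, the following shell sketch looks for libOpenCL.so in each LD_LIBRARY_PATH entry and then in /lib and /usr/lib. The function name find_libopencl is hypothetical and not part of ComputeAorta, and the sketch only approximates the runtime linker, which also consults embedded rpath entries and its own cache.

```shell
#!/bin/sh
# Hypothetical helper (not part of ComputeAorta): report the first directory
# that contains libOpenCL.so. This only approximates the runtime linker's
# search; embedded rpath entries and the linker cache are not consulted.
find_libopencl() {
    old_ifs=$IFS
    IFS=:
    # Directories from LD_LIBRARY_PATH first, then the standard locations.
    for dir in ${LD_LIBRARY_PATH:-} /lib /usr/lib; do
        if [ -n "$dir" ] && [ -e "$dir/libOpenCL.so" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/libOpenCL.so"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

find_libopencl || echo "libOpenCL.so not found in LD_LIBRARY_PATH, /lib or /usr/lib"
```

If the function reports nothing, either copy libOpenCL.so into one of the standard locations or add its directory to LD_LIBRARY_PATH before launching the application.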

Cross compiling

Since R-Car hardware has a 64-bit ARM host processor running a custom Yocto distribution, developers need to cross compile their OpenCL code using a specific toolchain. Use the version of the aarch64-poky-linux toolchain that corresponds to your version of Yocto; it is installed by default at /opt/poky. The Poky toolchain is installed by running the install script generated when building the Yocto SDK; see the hardware setup guide for details.

Standard gcc shared library linking options can be passed to the compiler: -lOpenCL to link against libOpenCL, and -L /path/to/libOpenCL so that the linker can locate the file. A path to the Khronos OpenCL 1.2 headers must also be provided with -I; these can be downloaded from the Khronos Group GitHub repository. Because the code is being cross compiled, the --sysroot option also needs to be set to the logical root directory for the standard cross-compiled libraries and includes, which is located in the Poky toolchain install.

The example compilation command line below illustrates all of these flags.

/opt/poky/2.1.3/sysroots/x86_64-pokysdk-linux/usr/bin/aarch64-poky-linux/aarch64-poky-linux-gcc main.c \
  -I /path/to/opencl1.2/includes \
  -I /opt/poky/2.1.3/sysroots/aarch64-poky-linux/usr/include \
  -L /path/to/opencl/ \
  -lOpenCL \
  --sysroot=/opt/poky/2.1.3/sysroots/aarch64-poky-linux
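
Building on the command above, this sketch also embeds a runtime library search path in the executable, so that LD_LIBRARY_PATH is not needed on the device. gcc forwards -rpath to the linker through -Wl. The helper name cross_compile_cmd and the /path/... values are placeholders; the rpath directory is wherever libOpenCL.so lives on the device, which may differ from the host-side -L path. The helper only prints the command so that it can be inspected first.

```shell
#!/bin/sh
# Assumed SDK install location and version, matching the example above.
SDK_ROOT=/opt/poky/2.1.3
CROSS_GCC="$SDK_ROOT/sysroots/x86_64-pokysdk-linux/usr/bin/aarch64-poky-linux/aarch64-poky-linux-gcc"
TARGET_SYSROOT="$SDK_ROOT/sysroots/aarch64-poky-linux"

# Hypothetical helper: print a cross-compile command that also embeds an rpath.
#   $1 source file                  $2 host directory holding the OpenCL headers
#   $3 host directory of libOpenCL  $4 directory of libOpenCL.so on the device
cross_compile_cmd() {
    printf '%s %s -I %s -L %s -lOpenCL -Wl,-rpath,%s --sysroot=%s\n' \
        "$CROSS_GCC" "$1" "$2" "$3" "$4" "$TARGET_SYSROOT"
}

# Print the command for inspection; copy it into a shell to run it.
cross_compile_cmd main.c /path/to/opencl1.2/includes /path/to/opencl /path/to/libOpenCL/dir
```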

Validating

If your ComputeAorta archive came with a pre-compiled clVectorAddition binary, this can be used to verify that the drivers are configured correctly and that the application is able to link to the OpenCL library.

Ensure that LD_LIBRARY_PATH is set. Use the following command to point it at the directory containing the OpenCL shared library.

export LD_LIBRARY_PATH=/path/to/libOpenCL/dir

When run successfully, clVectorAddition will output the following:

LD_LIBRARY_PATH=. ./clVectorAddition
[rcv_sample_get_localdata] rcv_thread_key is not initialized
### g_mem_base_i_virt_addr = 0xffff5c000000
Available platforms are:
  1. ComputeAorta

Selected platform 1

Running example on platform 1
Available devices are:
  1. Codeplay Software Ltd. - Renesas CV Engine

Selected device 1

Running example on device 1
 * Created context
 * Built program
 * Created buffers
 * Created kernel and set arguments
 * Created command queue
 * Enqueued writes to source buffers
 * Enqueued NDRange kernel
 * Enqueued read from destination buffer
 * Result verified
 * Released all created OpenCL objects

Example ran successfully, exiting
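
To check this output automatically, for instance from a board bring-up script, a small filter along the following lines could be used. The function name check_vector_addition is hypothetical, and the sketch assumes only that the two lines shown in the sample output above appear verbatim.

```shell
#!/bin/sh
# Hypothetical helper: read clVectorAddition output on stdin and succeed only
# if both the verification line and the final success line are present.
check_vector_addition() {
    log=$(cat)
    printf '%s\n' "$log" | grep -q 'Result verified' &&
        printf '%s\n' "$log" | grep -q 'Example ran successfully, exiting'
}

# Usage on the device (assumes the binary is in the current directory):
#   ./clVectorAddition 2>&1 | check_vector_addition && echo "OpenCL setup OK"
```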