Limitations

After reading the previous sections, you should have a strong foundation in how SYCL can be used to accelerate your code. Now is a good time to talk about the limitations imposed on kernel code - unfortunately, not every feature of C++ can be used. This is primarily because of hardware limitations - accelerators differ in their architecture and do not support the same kinds of machine instructions as CPUs.

The SYCL specification lists the following features as unavailable in kernels:

run time type information (RTTI)
exceptions
recursion
virtual function calls

Fortunately, in the majority of cases we can work around these restrictions. Instead of exceptions, you can use C-style error return values. Tail-recursive functions can be rewritten by hand as loops, which is what tail call elimination would otherwise do for you. General recursion might not be possible, though. Finally, as long as all the child classes are known, virtual function calls can be replaced with a type enumeration:

Pseudo-virtual call using type enumeration

#include <iostream>

enum class type_t {
  CHILD_A,
  CHILD_B,
};

struct Base {
  type_t type;

  Base(type_t type)
    : type(type) {}
};

struct ChildA : public Base {
  ChildA()
    : Base(type_t::CHILD_A) {}

  void pseudo_virtual() {
    std::cout << "A calling." << std::endl;
  }
};

struct ChildB : public Base {
  ChildB()
    : Base(type_t::CHILD_B) {}

  void pseudo_virtual() {
    std::cout << "B calling." << std::endl;
  }
};

void call_pv(Base* base) {
  if (base->type == type_t::CHILD_A) {
    static_cast<ChildA*>(base)->pseudo_virtual();
  } else {
    static_cast<ChildB*>(base)->pseudo_virtual();
  }
}
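
The other two workarounds mentioned earlier, error return values instead of exceptions and loops instead of tail recursion, can be sketched in the same spirit. This is plain C++ for illustration only; status_t, safe_divide and factorial_loop are hypothetical names, not part of SYCL. In a real kernel, the status would presumably be written to memory that the host can read back, which is left out here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical status codes; a real project would define its own set.
enum class status_t {
  OK,
  DIVIDE_BY_ZERO,
};

// Instead of throwing, report failure through the return value and
// deliver the result through an output parameter.
inline status_t safe_divide(int numerator, int denominator, int& result) {
  if (denominator == 0) {
    return status_t::DIVIDE_BY_ZERO;
  }
  result = numerator / denominator;
  return status_t::OK;
}

// A tail-recursive factorial would look like this, but recursion is
// not available in kernels:
//
//   std::uint64_t factorial(std::uint64_t n, std::uint64_t acc = 1) {
//     return n <= 1 ? acc : factorial(n - 1, acc * n);
//   }
//
// The same computation as a loop: the accumulator argument becomes a
// local variable and the recursive call becomes the next iteration.
inline std::uint64_t factorial_loop(std::uint64_t n) {
  std::uint64_t acc = 1;
  while (n > 1) {
    acc *= n;
    --n;
  }
  return acc;
}

// Host-side sanity check of both helpers.
inline void check_workarounds() {
  int quotient = 0;
  assert(safe_divide(6, 3, quotient) == status_t::OK);
  assert(quotient == 2);
  assert(safe_divide(1, 0, quotient) == status_t::DIVIDE_BY_ZERO);
  assert(factorial_loop(5) == 120);
}
```

The caller checks the returned status_t before using the result, in the same way it would check a C error code.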

Additionally, C++ offers another technique, the CRTP idiom, which expresses polymorphism at compile time and therefore comes at a near-zero performance cost at run time. The derived class inherits from a base class template and passes itself as the template argument. Because the derived type is then visible to the compiler, the base class can cast itself to that type, which contains the implementation details, via static_cast. However, this means that the entire interface of the class must be declared in the base class.

Pseudo-virtual call using CRTP (Curiously Recurring Template Pattern)

#include <iostream>

template <class Derived>
struct Base {
  void pseudo_virtual() {
    static_cast<Derived&>(*this).pseudo_virtual();
  }
};

struct ChildA : public Base<ChildA> {
  void pseudo_virtual() {
    std::cout << "A calling." << std::endl;
  }
};

struct ChildB : public Base<ChildB> {
  void pseudo_virtual() {
    std::cout << "B calling." << std::endl;
  }
};

template<class Derived>
void call_pv(Base<Derived>& base) {
  base.pseudo_virtual();
}

It is also important to note that CRTP removes the need for pointers to a base class: the object can be passed by reference, and the concrete type is resolved at compile time. If you wish to learn more about CRTP and its uses in SYCL kernels, have a look at our blog post on Enabling Polymorphism in SYCL using CRTP.
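
As a minimal sketch of what this looks like in use, the following plain C++ variant returns a value instead of printing, so the result can be checked. Shape, Square, Rectangle, area_impl and doubled_area are hypothetical names chosen for illustration. Unlike the listing above, the derived classes implement a differently named function (area_impl), so forgetting to implement it is a compile error rather than a call that recurses into the base forever.

```cpp
#include <cassert>

// The interface is declared once in the base and forwarded to the
// derived class, whose type is known at compile time.
template <class Derived>
struct Shape {
  float area() const {
    return static_cast<const Derived&>(*this).area_impl();
  }
};

struct Square : public Shape<Square> {
  float side;

  explicit Square(float side)
    : side(side) {}

  float area_impl() const {
    return side * side;
  }
};

struct Rectangle : public Shape<Rectangle> {
  float width;
  float height;

  Rectangle(float width, float height)
    : width(width), height(height) {}

  float area_impl() const {
    return width * height;
  }
};

// Works with any Shape<Derived> by reference; no base-class pointer
// and no virtual dispatch is involved.
template <class Derived>
float doubled_area(const Shape<Derived>& shape) {
  return 2.0f * shape.area();
}

// Host-side sanity check.
inline void check_shapes() {
  assert(doubled_area(Square(2.0f)) == 8.0f);
  assert(doubled_area(Rectangle(2.0f, 3.0f)) == 12.0f);
}
```

Each call to doubled_area instantiates a separate function for the concrete type, which is why the whole hierarchy has to be known when the kernel is compiled.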

Of course, these workarounds only apply when every child class is known at compile time, and this is not always the case - we might have an unknown type hierarchy. Unfortunately, at the moment all device code has to be contained within a single translation unit. In practice, this means that kernel code can only be split into headers - not separately compiled .cpp files.
