Limitations

After reading the previous sections, you should have a solid foundation in how SYCL can be used to accelerate your code. Now is a good time to talk about the limitations imposed on kernel code: unfortunately, not every feature of C++ can be used. This is primarily because of hardware limitations. Accelerators differ in their architecture and do not support the same kinds of machine instructions as CPUs.

The SYCL specification lists the following features as unavailable in kernels:

  • run-time type information (RTTI)

  • exceptions

  • recursion

  • virtual function calls

Fortunately, we can work around some of these restrictions in the majority of cases. Instead of exceptions, you can use C-style error return values. Tail-recursive functions can be rewritten as loops, which amounts to performing tail call elimination by hand. General recursion might not be possible, though. Finally, as long as all the child classes are known, virtual function calls can be replaced with a type enumeration:

Pseudo-virtual call using type enumeration

  #include <iostream>

  enum class type_t {
    CHILD_A,
    CHILD_B,
  };

  struct Base {
    type_t type;

    Base(type_t type)
      : type(type) {}
  };

  struct ChildA : public Base {
    ChildA()
      : Base(type_t::CHILD_A) {}

    void pseudo_virtual() {
      std::cout << "A calling." << std::endl;
    }
  };

  struct ChildB : public Base {
    ChildB()
      : Base(type_t::CHILD_B) {}

    void pseudo_virtual() {
      std::cout << "B calling." << std::endl;
    }
  };

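  // Dispatch by hand on the stored type tag instead of relying on a vtable.
  // This only works because every child class is known at this point.
  void call_pv(Base* base) {
    if (base->type == type_t::CHILD_A) {
      static_cast<ChildA*>(base)->pseudo_virtual();
    } else {
      // Only two children exist, so any other tag must be CHILD_B.
      static_cast<ChildB*>(base)->pseudo_virtual();
    }
  }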
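The listing above prints through std::cout, which keeps it easy to read on the host, but standard streams are typically not usable inside a kernel. As a minimal sketch of the same type-enumeration dispatch in a more kernel-friendly shape, the variant below returns a value instead of printing. The return values 1 and 2, the fallback -1, and the use of a switch are assumptions made for illustration only; they are not part of the original example.

```cpp
// Type tag stored in every object, replacing the hidden vtable pointer.
enum class type_t {
  CHILD_A,
  CHILD_B,
};

struct Base {
  type_t type;

  explicit Base(type_t type)
    : type(type) {}
};

struct ChildA : public Base {
  ChildA()
    : Base(type_t::CHILD_A) {}

  // Illustrative result; a real class would do actual work here.
  int pseudo_virtual() const { return 1; }
};

struct ChildB : public Base {
  ChildB()
    : Base(type_t::CHILD_B) {}

  // Illustrative result; a real class would do actual work here.
  int pseudo_virtual() const { return 2; }
};

// Manual dispatch: inspect the tag, cast to the matching child, call it.
// No virtual functions, RTTI, or I/O are involved.
inline int call_pv(const Base* base) {
  switch (base->type) {
    case type_t::CHILD_A:
      return static_cast<const ChildA*>(base)->pseudo_virtual();
    case type_t::CHILD_B:
      return static_cast<const ChildB*>(base)->pseudo_virtual();
  }
  return -1;  // Only reached if the tag holds a value outside the enumeration.
}
```

Calling call_pv with the address of a ChildA object yields 1, and with a ChildB object yields 2. Using a switch over the enumeration has the side benefit that many compilers can warn when a newly added child type is not handled.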

Of course, the child classes are not always known in advance: we might have an unknown type hierarchy, in which case the type enumeration cannot be used. Another restriction is that, at the moment, all device code has to be contained within a single translation unit. In practice, this means that kernel code can only be split into headers, not into separately compiled .cpp files.
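The first two workarounds mentioned earlier, error return values in place of exceptions and loops in place of tail recursion, can be sketched together in one small function. The example below uses Euclid's greatest common divisor algorithm, which is naturally tail-recursive. The names status_t and gcd_loop are hypothetical, and treating the input (0, 0) as an error is an assumption chosen only to have something to report.

```cpp
#include <cstdint>

// Hypothetical status codes standing in for exceptions.
enum class status_t {
  OK,
  INVALID_ARGUMENT,
};

// Recursive form, shown for comparison only (recursion is not allowed
// in kernel code):
//   unsigned gcd(unsigned a, unsigned b) { return b == 0 ? a : gcd(b, a % b); }

// Loop form of the same algorithm. Each iteration plays the role of one
// recursive call: the arguments are overwritten instead of being passed on.
// The result is written to 'result' only on success.
inline status_t gcd_loop(std::uint32_t a, std::uint32_t b,
                         std::uint32_t& result) {
  if (a == 0 && b == 0) {
    // Host code might throw here; in a kernel we return a status instead.
    return status_t::INVALID_ARGUMENT;
  }
  while (b != 0) {
    const std::uint32_t next = a % b;
    a = b;
    b = next;
  }
  result = a;
  return status_t::OK;
}
```

The caller checks the returned status before reading the result, much as with C library functions. In a kernel, the status would usually be written to an output buffer so that the host can inspect it once the kernel has finished.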
