After reading all the previous sections you should have a strong foundation in how SYCL can be used to accelerate your code. Now is a good time to talk about the limitations imposed on kernel code - unfortunately not every feature of C++ can be used. This is primarily because of hardware limitations - accelerators differ in their architecture and do not support the same kinds of machine instructions as CPUs.
The SYCL specification lists the following features as unavailable in kernels:
- run-time type information (RTTI)
- exceptions
- recursion
- virtual function calls
Fortunately, we can work around some of these restrictions in the majority of cases. Instead of exceptions, you can use C-style error return values. Tail-recursive functions can be rewritten as loops, which is the same transformation that tail call elimination performs. General recursion might not be possible, though. Finally, as long as all the child classes are known, virtual function calls can be replaced with a type enumeration:
Pseudo-virtual call using type enumeration
#include <iostream>

// One enumerator per known child class.
enum class type_t {
  CHILD_A,
  CHILD_B,
};

struct Base {
  type_t type;  // tag identifying the concrete type

  Base(type_t type)
    : type(type) {}
};

struct ChildA : public Base {
  ChildA()
    : Base(type_t::CHILD_A) {}

  void pseudo_virtual() {
    std::cout << "A calling." << std::endl;
  }
};

struct ChildB : public Base {
  ChildB()
    : Base(type_t::CHILD_B) {}

  void pseudo_virtual() {
    std::cout << "B calling." << std::endl;
  }
};

// Dispatch on the tag instead of going through a vtable.
void call_pv(Base* base) {
  if (base->type == type_t::CHILD_A) {
    static_cast<ChildA*>(base)->pseudo_virtual();
  } else {
    static_cast<ChildB*>(base)->pseudo_virtual();
  }
}
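The other two workarounds mentioned above, error return values in place of exceptions and a loop in place of tail recursion, could look roughly like the sketch below. The names `status_t`, `safe_divide` and `factorial` are hypothetical and chosen only for illustration; they are not part of SYCL.

```cpp
#include <cstdint>

// Hypothetical status codes; illustrative only, not part of SYCL.
enum class status_t {
  OK,
  DIVISION_BY_ZERO,
};

// Instead of throwing, report failure through the return value and
// write the result through an output parameter.
inline status_t safe_divide(int numerator, int denominator, int& result) {
  if (denominator == 0) {
    return status_t::DIVISION_BY_ZERO;
  }
  result = numerator / denominator;
  return status_t::OK;
}

// Tail-recursive form, which could not be used in a kernel:
//   std::uint64_t factorial(std::uint64_t n, std::uint64_t acc = 1) {
//     return n <= 1 ? acc : factorial(n - 1, acc * n);
//   }
// Equivalent loop: the accumulator parameter becomes a local variable
// and the recursive call becomes the next loop iteration.
inline std::uint64_t factorial(std::uint64_t n) {
  std::uint64_t acc = 1;
  while (n > 1) {
    acc *= n;
    --n;
  }
  return acc;
}
```

The caller checks the returned status before using `result`, much like checking an error code from a C API. On failure, `safe_divide` leaves `result` untouched.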
For virtual calls, there is another technique, the CRTP idiom (Curiously Recurring Template Pattern), which expresses polymorphism at compile time and therefore costs close to nothing at run time. The base class is a template that takes the derived class as its parameter, and each derived class inherits from the base instantiated with itself. Because the compiler knows the derived type, the base can cast itself to that type with static_cast and reach the implementation. However, this means that the entire interface of the class must be declared in the base class.
Pseudo-virtual call using CRTP (Curiously Recurring Template Pattern)
#include <iostream>

template <class Derived>
struct Base {
  // Forward to the derived implementation; resolved at compile time.
  void pseudo_virtual() {
    static_cast<Derived&>(*this).pseudo_virtual();
  }
};

struct ChildA : public Base<ChildA> {
  void pseudo_virtual() {
    std::cout << "A calling." << std::endl;
  }
};

struct ChildB : public Base<ChildB> {
  void pseudo_virtual() {
    std::cout << "B calling." << std::endl;
  }
};

// Accepts any child by reference; Derived is deduced from the argument.
template <class Derived>
void call_pv(Base<Derived>& base) {
  base.pseudo_virtual();
}
CRTP also removes the need for pointer types, since objects can be passed by reference or by value. If you wish to learn more about CRTP and its uses in SYCL kernels, have a look at our blog post on Enabling Polymorphism in SYCL using CRTP.
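As a small illustration of CRTP without pointers, here is a minimal sketch that returns a value instead of printing. The names `Shape`, `Square`, `Rectangle`, `area_impl` and `scaled_area` are hypothetical and not taken from the blog post.

```cpp
// CRTP base: declares the interface and forwards to the derived class.
template <class Derived>
struct Shape {
  float area() const {
    return static_cast<const Derived&>(*this).area_impl();
  }
};

struct Square : public Shape<Square> {
  float side;
  explicit Square(float side) : side(side) {}
  float area_impl() const { return side * side; }
};

struct Rectangle : public Shape<Rectangle> {
  float width;
  float height;
  Rectangle(float width, float height) : width(width), height(height) {}
  float area_impl() const { return width * height; }
};

// Works for any shape; no pointer and no run-time type tag is needed.
template <class Derived>
float scaled_area(const Shape<Derived>& shape, float factor) {
  return shape.area() * factor;
}
```

Giving the implementation a different name (`area_impl`) than the interface (`area`) is a design choice of this sketch. If a derived class forgets to provide it, the result is a compile error rather than the base function calling itself.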
Of course, these workarounds only apply when the type hierarchy is known in advance, and that is not always the case. Unfortunately, at the moment all device code has to be contained within a single translation unit. In practice, this means that kernel code can only be split into headers, not into separately compiled .cpp files.