cl::sycl::handler Class Reference
A handler gives the user access to command group scope functionality, such as API calls. More...
#include <apis.h>
Public Member Functions | |
COMPUTECPP_TEST_VIRTUAL | ~handler () |
Destructor of the handler. It is implemented in the .cpp file so that default_deleter can see the implementation of the internal transaction object. More... | |
template<typename T > | |
void | set_arg (int param_num, T &&param) |
Sets an argument when using interop kernels. More... | |
template<typename... Ts> | |
void | set_args (Ts &&... args) |
Sets all the given arguments for an OpenCL kernel, as if set_arg() were called with each of them in the same order, with the index increasing from 0. More... | |
void | single_task (kernel syclKernel) |
This function effectively just launches a single thread to execute the kernel in serial asynchronously to the host execution. More... | |
template<typename nameT = std::nullptr_t, typename functorT > | |
void | single_task (const functorT &functor) |
This function effectively just launches a single thread to execute the kernel in serial asynchronously to the host execution. More... | |
template<typename nameT = std::nullptr_t, typename functorT > | |
void | single_task (kernel syclKernel, const functorT &functor) |
This function effectively just launches a single thread to execute the kernel in serial asynchronously to the host execution. More... | |
template<int dimensions> | |
void | parallel_for (const nd_range< dimensions > &ndRange, kernel syclKernel) |
parallel_for enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the number of local and global work items specified by ndRange. More... | |
template<int dimensions> | |
void | parallel_for (const range< dimensions > &range, kernel syclKernel) |
parallel_for enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the number of global work items specified by range. More... | |
template<int dimensions> | |
void | parallel_for (const range< dimensions > &range, id< dimensions > offset, kernel syclKernel) |
parallel_for enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the number of global work items specified by range. More... | |
template<typename nameT = std::nullptr_t, typename functorT , int dimensions> | |
void | parallel_for (const nd_range< dimensions > &ndRange, const functorT &functor) |
parallel_for enqueues the kernel functor to be executed as a number of instances working in parallel over the number of local and global work items specified by ndRange. More... | |
template<typename nameT = std::nullptr_t, typename functorT , int dimensions> | |
void | parallel_for (kernel syclKernel, const nd_range< dimensions > &ndRange, const functorT &functor) |
parallel_for enqueues the kernel functor to be executed as a number of instances working in parallel over the number of local and global work items specified by ndRange. More... | |
template<typename nameT = std::nullptr_t, typename functorT , int dimensions> | |
void | parallel_for (const range< dimensions > &range, const functorT &functor) |
parallel_for enqueues the kernel functor to be executed as a number of instances working in parallel over the number of global work items specified by range. More... | |
template<typename nameT = std::nullptr_t, typename functorT > | |
void | parallel_for (const size_t range, const functorT &functor) |
parallel_for enqueues the kernel functor to be executed as a number of instances working in parallel over the number of global work items specified by range. More... | |
template<typename nameT = std::nullptr_t, typename functorT , int dimensions> | |
void | parallel_for (const range< dimensions > &range, const id< dimensions > &offset, const functorT &functor) |
parallel_for enqueues the kernel functor to be executed as a number of instances working in parallel over the number of global work items specified by range. More... | |
template<typename nameT = std::nullptr_t, typename functorT , int dimensions> | |
void | parallel_for (kernel syclKernel, const range< dimensions > &range, const functorT &functor) |
parallel_for enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the number of global work items specified by range. More... | |
template<typename nameT = std::nullptr_t, typename functorT , int dimensions> | |
void | parallel_for (kernel syclKernel, const range< dimensions > &range, const id< dimensions > &offset, const functorT &functor) |
parallel_for enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the number of global work items specified by range. More... | |
template<int dimensions> | |
void | parallel_for_work_group (const range< dimensions > &numGroups, kernel syclKernel) |
parallel_for_work_group enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the number of local and global work items specified by numGroups. More... | |
template<typename nameT = std::nullptr_t, typename functorT , int dimensions> | |
void | parallel_for_work_group (kernel syclKernel, const range< dimensions > &range, const functorT &functor) |
parallel_for_work_group enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the number of local and global work items specified by range. More... | |
template<typename nameT = std::nullptr_t, typename functorT , int dimensions> | |
void | parallel_for_work_group (const range< dimensions > &range, const functorT &functor) |
parallel_for_work_group enqueues the kernel functor to be executed as a number of instances working in parallel over the number of local and global work items specified by range. More... | |
template<typename nameT = std::nullptr_t, typename functorT , int dimensions> | |
void | parallel_for_work_group (const range< dimensions > &numGroups, const range< dimensions > &groupSize, const functorT &functor) |
parallel_for_work_group enqueues the kernel functor to be executed as a number of instances working in parallel over the number of local and global work items specified by numGroups and groupSize. More... | |
template<typename nameT = std::nullptr_t, typename functorT , int dimensions> | |
void | parallel_for_work_group (kernel syclKernel, const range< dimensions > &numGroups, const range< dimensions > &groupSize, const functorT &functor) |
parallel_for_work_group enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the number of local and global work items specified by numGroups and groupSize. More... | |
template<typename elemT , int kDims, access::mode kMode, access::target kTarget, access::placeholder IsPlaceholder = access::placeholder::true_t> | |
void | require (const accessor< elemT, kDims, kMode, kTarget, IsPlaceholder > &acc) |
Function that registers a placeholder accessor with the handler. More... | |
template<typename elemT , int kDims, access::mode kMode, access::placeholder isPlaceholder> | |
void | register_for_dma (accessor< elemT, kDims, kMode, COMPUTECPP_ACCESS_TARGET_DEVICE, isPlaceholder > &acc, size_t stride) |
Registers a global memory accessor for DMA transfer. More... | |
template<typename elemT , int kDims, access::placeholder isPlaceholder> | |
void | register_for_dma (accessor< elemT, kDims, access::mode::read, COMPUTECPP_ACCESS_TARGET_DEVICE, isPlaceholder > &acc, size_t stride) |
Registers a constant memory accessor for DMA transfer. More... | |
template<typename elemT , int kDims, access::mode kMode, access::target kTarget> | |
COMPUTECPP_DEPRECATED_API ("Deprecated Codeplay extension function: " "Bind the null accessor first, then call require()") void require(buffer< elemT | |
Function that registers a placeholder accessor with the handler and the associated storage. More... | |
void | experimental_depends_on (cl::sycl::event e) |
Register a single event that this handler should wait for before running. More... | |
void | experimental_depends_on (const std::vector< cl::sycl::event > &v) |
Register a set of events that this handler should wait for before running. More... | |
void | depends_on (cl::sycl::event e) |
Register a single event that this handler should wait for before running. More... | |
void | depends_on (const std::vector< cl::sycl::event > &v) |
Register a set of events that this handler should wait for before running. More... | |
template<typename TAcc , typename THostPtr , int dims, cl::sycl::access::mode accessMode, cl::sycl::access::target accessTarget, access::placeholder isPlaceholder, COMPUTECPP_ENABLE_IF( TAcc,(detail::can_copy_types< TAcc, THostPtr >::value && detail::is_read_mode< accessMode >::value)) > | |
void | copy (accessor< TAcc, dims, accessMode, accessTarget, isPlaceholder > acc, shared_ptr_class< THostPtr > hostPtr) |
Copies the data from the device accessor to the host pointer. More... | |
template<typename TAcc , typename THostPtr , int dims, cl::sycl::access::mode accessMode, cl::sycl::access::target accessTarget, access::placeholder isPlaceholder, COMPUTECPP_ENABLE_IF( TAcc,(detail::can_copy_types< THostPtr, TAcc >::value && detail::is_write_mode< accessMode >::value)) > | |
void | copy (shared_ptr_class< THostPtr > hostPtr, accessor< TAcc, dims, accessMode, accessTarget, isPlaceholder > acc) |
Copies the data from the host pointer to the device accessor. More... | |
template<typename TAcc , typename THostPtr , int dims, cl::sycl::access::mode accessMode, cl::sycl::access::target accessTarget, access::placeholder isPlaceholder, COMPUTECPP_ENABLE_IF( TAcc,(detail::can_copy_types< TAcc, THostPtr >::value && detail::is_read_mode< accessMode >::value)) > | |
void | copy (accessor< TAcc, dims, accessMode, accessTarget, isPlaceholder > acc, THostPtr *hostPtr) |
Copies the data from the device accessor to the host pointer. More... | |
template<typename TAcc , typename THostPtr , int dims, cl::sycl::access::mode accessMode, cl::sycl::access::target accessTarget, access::placeholder isPlaceholder, COMPUTECPP_ENABLE_IF( TAcc,(detail::can_copy_types< THostPtr, TAcc >::value && detail::is_write_mode< accessMode >::value)) > | |
void | copy (const THostPtr *hostPtr, accessor< TAcc, dims, accessMode, accessTarget, isPlaceholder > acc) |
Copies the data from the host pointer to the device accessor. More... | |
template<typename T , typename U , int dimsOrig, int dimsDest, access::mode accModeOrig, access::mode accModeDest, access::target accTargetOrig, access::target accTargetDest, access::placeholder isPlaceholderOrig, access::placeholder isPlaceholderDest, COMPUTECPP_ENABLE_IF(T,((detail::can_copy_types< T, U >::value) &&(detail::is_read_mode< accModeOrig >::value) &&(detail::is_write_mode< accModeDest >::value))) > | |
void | copy (accessor< T, dimsOrig, accModeOrig, accTargetOrig, isPlaceholderOrig > originAcc, accessor< U, dimsDest, accModeDest, accTargetDest, isPlaceholderDest > destinationAcc) |
Copies data associated with the origin accessor to the data associated with the destination accessor. More... | |
template<typename TAcc , typename T , int dims, cl::sycl::access::mode accessMode, cl::sycl::access::target accessTarget, access::placeholder isPlaceholder, COMPUTECPP_ENABLE_IF(TAcc,(detail::can_copy_types< T, TAcc >::value && detail::is_write_mode< accessMode >::value)) > | |
void | fill (accessor< TAcc, dims, accessMode, accessTarget, isPlaceholder > acc, T val) |
Fills the data associated with the accessor using the scalar value. More... | |
template<typename T , int dims, access::mode accessMode, access::target accessTarget, access::placeholder isPlaceholder> | |
void | update_host (accessor< T, dims, accessMode, accessTarget, isPlaceholder > acc) |
Update the memory object accessed by a given accessor on the host. More... | |
template<typename T > | |
void | fill (void *ptr, const T &pattern, size_t count) |
Fills the memory pointed to by ptr. More... | |
void | memcpy (void *dest, const void *src, size_t size) |
Copies size bytes from src to dest. More... | |
void | experimental_prefetch (const void *ptr, size_t size) |
Hints to the SYCL runtime that the data should be made available earlier than the USM model would require it. More... | |
void | prefetch (const void *ptr, size_t size) |
Hints to the SYCL runtime that the data should be made available earlier than the USM model would require it. More... | |
Public Attributes | |
kDims & | bufObj |
kDims const accessor< elemT, kDims, kMode, kTarget, access::placeholder::true_t > & | acc |
Protected Member Functions | |
void | use_kernel_bundle_impl (const dkernelbundle_shptr execBundle) |
This command group will use device images from the given kernel bundle when invoking kernels. More... | |
void | update_device_data (const accessor_base &acc, shared_ptr_class< void > hostPtr, cl::sycl::access::mode accessMode, bool userProvidedPtr) |
Updates device data by copying to/from the device. More... | |
void | update_host_impl (const accessor_base &acc) |
Implementation for update_host. More... | |
void | copy_in_device (const accessor_base &originAcc, const accessor_base &destinationAcc) |
Copies data associated with the origin accessor to the data associated with the destination accessor. More... | |
void | fill (const accessor_base &acc, const void *patternData, const size_t patternSize) |
Fills the range of the accessor with the pattern pointed to by patternData. More... | |
void | fill (void *ptr, const void *patternData, size_t patternSize, size_t size) |
Fills the memory pointed to by ptr. More... | |
void | interop_task_impl (const detail::interop_task_ptr &hostTaskCallable) |
Schedules a host task with an interop_handle object. More... | |
handler (const dqueue_shptr &q, const dqueue_shptr &fallbackQueue=nullptr) | |
Creates a handler for a specific queue. More... | |
context | get_context () const |
Returns the current context for the command group. More... | |
ddevice_wkptr | get_device_weak () const |
Returns the current device for the command group. More... | |
void | execute_kernel_single_task_ptr (const detail::nd_range_base &ndRange, const kernel &syclKernel, const detail::single_task_ptr &funcPtr, detail::enqueue_device_kernel_command *currentCommand) |
Creates the internal structures to execute the kernel. More... | |
void | execute_kernel_parallel_for_ptr (const detail::nd_range_base &ndRange, const kernel &syclKernel, const detail::parallel_for_ptr &funcPtr, detail::enqueue_device_kernel_command *currentCommand, int dimensions) |
void | execute_kernel_parallel_for_id_ptr (const detail::nd_range_base &ndRange, const kernel &syclKernel, const detail::parallel_for_id_ptr &funcPtr, detail::enqueue_device_kernel_command *currentCommand, int dimensions) |
void | execute_kernel_parallel_for_work_group_ptr (const detail::nd_range_base &ndRange, const kernel &syclKernel, const detail::parallel_for_work_group_ptr &funcPtr, detail::enqueue_device_kernel_command *currentCommand, int dimensions) |
void | process_functor_arguments_impl (kernel syclKernel, detail::binary_address functorBuffer, const detail::functor_arg_descriptor &argDesc, detail::enqueue_device_kernel_command *currentCommand) |
Gets the parameters from a functor and sets them as OpenCL arguments. More... | |
void | require (const accessor_base &acc) |
Internal function that registers a placeholder accessor with the handler. More... | |
void | require (const accessor_base &acc, dmem_shptr memObj, access::mode mode, access::target target) |
Internal function that registers a placeholder accessor with the handler. More... | |
void | register_for_dma (accessor_base &acc, size_t strideBytes) |
Registers an accessor for DMA transfer. More... | |
detail::enqueue_device_kernel_command * | create_kernel_command_group (kernel &syclKernel) |
Creates a kernel command group. Internal implementation. More... | |
detail::index_array | get_optimal_workgroup_size (const kernel &syclKernel) |
Gets the optimal workgroup size for the current device and the given kernel. More... | |
bool | has_kernel_bundle () const |
Returns true if use_kernel_bundle(bundle) has been called with the handler. More... | |
program | get_kernel_bundle_program () const |
Get the program to use from a kernel bundle. More... | |
Protected Attributes | |
dtrans_uptr | m_trans |
Internal transaction associated with the handler. More... | |
dqueue_shptr | m_queue |
Queue to which this handler was submitted. More... | |
dqueue_shptr | m_fallbackQueue |
Pointer to the fallback queue (if any) More... | |
vector_class< add_param_func_t > | m_paramVec |
unsigned | m_numKernels |
Number of kernels in the command group. More... | |
Detailed Description
A handler gives the user access to command group scope functionality, such as API calls.
This simplifies the interface, as the command group class is no longer required and the scope is explicit for accessors and API entries.
It is also used by accessors to get the current command group scope. Handlers can only be constructed from within queues. For the time being, the deprecated command group function can also create handlers.
The templated side of the API entries is defined here. Some API entries are explicitly deleted to shield users from opaque template errors caused by the enable_if macros. In particular, if a pointer to a kernel is passed instead of a kernel, template deduction fails and produces a very long template error. With the deleted API entry, the user instead sees an explicit error, because they are using an invalid interface.
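The effect of an explicitly deleted API entry can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `kernel_model`, `handler_model`, `make_void`, `can_single_task` and `launch_once` are hypothetical names introduced for illustration and are not part of ComputeCpp, and the real handler chooses its own set of deleted overloads.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a kernel object; not the ComputeCpp type.
struct kernel_model {};

// Hypothetical stand-in for the handler; not the ComputeCpp type.
struct handler_model {
  int launched = 0;

  // Valid entry: takes the kernel by value.
  void single_task(kernel_model) { ++launched; }

  // Deleted entry: passing a pointer to a kernel selects this overload, so
  // the compiler reports one short "use of deleted function" diagnostic.
  void single_task(kernel_model*) = delete;
};

// C++11-compatible void_t helper.
template <typename...>
struct make_void { using type = void; };

// Detects whether handler_model::single_task(Arg) is a usable call.
template <typename Arg, typename = void>
struct can_single_task : std::false_type {};

template <typename Arg>
struct can_single_task<
    Arg, typename make_void<decltype(std::declval<handler_model&>().single_task(
             std::declval<Arg>()))>::type> : std::true_type {};

static_assert(can_single_task<kernel_model>::value,
              "a kernel passed by value is accepted");
static_assert(!can_single_task<kernel_model*>::value,
              "a pointer to a kernel is rejected");

// Small self-check: one valid call launches exactly one task.
inline int launch_once() {
  handler_model h;
  h.single_task(kernel_model{});
  assert(h.launched == 1);
  return h.launched;
}
```

Writing `h.single_task(&k)` against this model fails to compile with a diagnostic that names the deleted overload, which is the behaviour the paragraph above describes.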
Constructor & Destructor Documentation
◆ ~handler()
COMPUTECPP_TEST_VIRTUAL cl::sycl::handler::~handler | ( | ) |
Destructor of the handler. It is implemented in the .cpp file so that default_deleter can see the implementation of the internal transaction object.
◆ handler()
|
explicitprotected |
Creates a handler for a specific queue.
Member Function Documentation
◆ COMPUTECPP_DEPRECATED_API()
cl::sycl::handler::COMPUTECPP_DEPRECATED_API | ( | "Deprecated Codeplay extension function: " "Bind the null accessor first, then call require()" | ) |
Function that registers a placeholder accessor with the handler and the associated storage.
Defined in Codeplay Extension CP004. Fails if the accessor is already associated with storage.
- Parameters
-
buf | Buffer object
acc | Placeholder accessor
- Deprecated:
- Bind the null accessor first, then call require()
◆ copy() [1/5]
|
inline |
Copies the data from the device accessor to the host pointer.
hostPtr must have enough space allocated to match the size of the accessor data.
The underlying type of the accessor and the host pointer must match
- Accessor type can be const
- At least one type is allowed to be void
- Template Parameters
-
TAcc | Underlying type of the data associated with the accessor
THostPtr | Underlying type of the host pointer data
dims | Number of dimensions of the accessor
accessMode | Access mode of the accessor
accessTarget | Access target of the accessor
isPlaceholder | Whether the accessor is a placeholder
COMPUTECPP_ENABLE_IF | The function is only valid when the access mode includes read access
- Parameters
-
acc | Accessor that is used to access the buffer or image
hostPtr | Host pointer that will be updated
◆ copy() [2/5]
|
inline |
Copies the data from the host pointer to the device accessor.
hostPtr must have enough space allocated to match the size of the accessor data.
The underlying type of the host pointer and the accessor must match
- Host pointer type can be const
- At least one type is allowed to be void
- Template Parameters
-
THostPtr | Underlying type of the host pointer data
TAcc | Underlying type of the data associated with the accessor
dims | Number of dimensions of the accessor
accessMode | Access mode of the accessor
accessTarget | Access target of the accessor
isPlaceholder | Whether the accessor is a placeholder
COMPUTECPP_ENABLE_IF | The function is only valid when the access mode includes write access
- Parameters
-
hostPtr | Host pointer that points to the new data
acc | Accessor that is used to access the buffer or image
◆ copy() [3/5]
|
inline |
Copies the data from the device accessor to the host pointer.
hostPtr must have enough space allocated to match the size of the accessor data.
The underlying type of the accessor and the host pointer must match
- Accessor type can be const
- At least one type is allowed to be void
- Template Parameters
-
TAcc | Underlying type of the data associated with the accessor
THostPtr | Underlying type of the host pointer data
dims | Number of dimensions of the accessor
accessMode | Access mode of the accessor
accessTarget | Access target of the accessor
isPlaceholder | Whether the accessor is a placeholder
COMPUTECPP_ENABLE_IF | The function is only valid when the access mode includes read access
- Parameters
-
acc | Accessor that is used to access the buffer or image
hostPtr | Host pointer that will be updated
◆ copy() [4/5]
|
inline |
Copies the data from the host pointer to the device accessor.
hostPtr must have enough space allocated to match the size of the accessor data.
The underlying type of the host pointer and the accessor must match
- Host pointer type can be const
- At least one type is allowed to be void
- Template Parameters
-
THostPtr | Underlying type of the host pointer data
TAcc | Underlying type of the data associated with the accessor
dims | Number of dimensions of the accessor
accessMode | Access mode of the accessor
accessTarget | Access target of the accessor
isPlaceholder | Whether the accessor is a placeholder
COMPUTECPP_ENABLE_IF | The function is only valid when the access mode includes write access
- Parameters
-
hostPtr | Host pointer that points to the new data
acc | Accessor that is used to access the buffer or image
◆ copy() [5/5]
|
inline |
Copies data associated with the origin accessor to the data associated with the destination accessor.
There are a few restrictions on the accessors:
- The underlying type and number of dimensions must match
- Origin type can be const
- The origin accessor access mode must include read access
- The destination accessor access mode must include write access
- The size of the destination accessor data must be enough to hold the data from the origin accessor
- Template Parameters
-
T | Underlying type of the data associated with the origin accessor
U | Underlying type of the data associated with the destination accessor
dimsOrig | Number of dimensions of the origin accessor
dimsDest | Number of dimensions of the destination accessor
accModeOrig | Access mode of the origin accessor
accModeDest | Access mode of the destination accessor
accTargetOrig | Access target of the origin accessor
accTargetDest | Access target of the destination accessor
isPlaceholderOrig | Whether the origin accessor is a placeholder
isPlaceholderDest | Whether the destination accessor is a placeholder
COMPUTECPP_ENABLE_IF | Checks that the accessor types conform to the restrictions.
- Parameters
-
originAcc | Accessor with the data that will be copied from
destinationAcc | Accessor with the data that will be copied to
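The restrictions listed for this overload can be modelled as a compile-time predicate plus one run-time size check. The sketch below is a simplified illustration, not the library's `detail::can_copy_types`, `detail::is_read_mode` or `detail::is_write_mode`: `mode_model`, `includes_read`, `includes_write`, `copy_allowed` and `destination_fits` are hypothetical names, and only three access modes are modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical subset of access modes; the real enum is cl::sycl::access::mode.
enum class mode_model { read, write, read_write };

constexpr bool includes_read(mode_model m) {
  return m == mode_model::read || m == mode_model::read_write;
}

constexpr bool includes_write(mode_model m) {
  return m == mode_model::write || m == mode_model::read_write;
}

// Compile-time restrictions: same underlying type (the origin may be const),
// same number of dimensions, readable origin, writable destination.
template <typename Orig, int DimsOrig, mode_model ModeOrig,
          typename Dest, int DimsDest, mode_model ModeDest>
constexpr bool copy_allowed() {
  return std::is_same<typename std::remove_const<Orig>::type, Dest>::value &&
         DimsOrig == DimsDest && includes_read(ModeOrig) &&
         includes_write(ModeDest);
}

// Run-time restriction: the destination must be able to hold the origin data.
inline bool destination_fits(std::size_t originCount, std::size_t destCount) {
  return destCount >= originCount;
}

static_assert(copy_allowed<const float, 2, mode_model::read,
                           float, 2, mode_model::read_write>(),
              "const origin, matching dimensions, read to write");
static_assert(!copy_allowed<float, 1, mode_model::write,
                            float, 1, mode_model::write>(),
              "a write-only origin cannot be read");
```

In the real class the compile-time part is enforced through COMPUTECPP_ENABLE_IF, so a non-conforming pair of accessors removes this overload from consideration rather than returning false.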
◆ copy_in_device()
|
protected |
Copies data associated with the origin accessor to the data associated with the destination accessor.
- Parameters
-
originAcc | Accessor with the data that will be copied from
destinationAcc | Accessor with the data that will be copied to
◆ create_kernel_command_group()
|
protected |
Creates a kernel command group. Internal implementation.
- Returns
- Pointer to the newly created command
◆ depends_on() [1/2]
|
inline |
◆ depends_on() [2/2]
|
inline |
◆ execute_kernel_parallel_for_id_ptr()
|
protected |
◆ execute_kernel_parallel_for_ptr()
|
protected |
◆ execute_kernel_parallel_for_work_group_ptr()
|
protected |
◆ execute_kernel_single_task_ptr()
|
protected |
Creates the internal structures to execute the kernel.
This function is explicitly instantiated in the .cpp file for the supported FunctorPtr types.
- Parameters
-
ndRange | The given nd range
syclKernel | The SYCL device kernel
funcPtr | The std::function pointer to the user functor
currentCommand | The command currently being executed
◆ experimental_depends_on() [1/2]
void cl::sycl::handler::experimental_depends_on | ( | cl::sycl::event | e | ) |
Register a single event that this handler should wait for before running.
- Parameters
-
e The event that the handler should wait for before running.
◆ experimental_depends_on() [2/2]
void cl::sycl::handler::experimental_depends_on | ( | const std::vector< cl::sycl::event > & | v | ) |
Register a set of events that this handler should wait for before running.
- Parameters
-
v a vector of events.
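The ordering guarantee of registering events can be pictured with a host-only model in which a command group runs only once every event on its wait list is complete. This is a sketch, not the ComputeCpp scheduler: `event_model`, `command_group_model` and `run_all` are hypothetical names, and events are tracked through pointers here, whereas the real functions take `cl::sycl::event` objects.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for an event: only tracks completion.
struct event_model {
  bool complete = false;
};

// Hypothetical stand-in for a command group with a wait list.
struct command_group_model {
  std::vector<const event_model*> waitList;  // filled by depends_on
  std::function<void()> work;                // the command to run
  event_model done;                          // completes after work() returns
  bool started = false;

  // Register a single event to wait for.
  void depends_on(const event_model& e) { waitList.push_back(&e); }

  // Register a set of events to wait for.
  void depends_on(const std::vector<const event_model*>& v) {
    waitList.insert(waitList.end(), v.begin(), v.end());
  }

  // True once every registered event has completed.
  bool ready() const {
    for (const event_model* e : waitList) {
      if (!e->complete) return false;
    }
    return true;
  }
};

// Repeatedly runs every command group whose wait list is satisfied, until no
// further progress is possible. Returns how many groups were executed.
inline std::size_t run_all(std::vector<command_group_model*>& groups) {
  std::size_t executed = 0;
  bool progress = true;
  while (progress) {
    progress = false;
    for (command_group_model* g : groups) {
      if (!g->started && g->ready()) {
        g->started = true;
        g->work();
        g->done.complete = true;
        ++executed;
        progress = true;
      }
    }
  }
  return executed;
}
```

Submission order does not decide execution order in this model: a group listed first still waits until the events it depends on have completed.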
◆ experimental_prefetch()
void cl::sycl::handler::experimental_prefetch | ( | const void * | ptr, |
size_t | size | ||
) |
Hints to the SYCL runtime that the data should be made available earlier than the USM model would require it.
Can only be overlapped with kernel execution when Concurrent or System USM is available.
- Parameters
-
ptr | Pointer to the memory to be prefetched to the device
size | Number of bytes requested to be prefetched
◆ fill() [1/4]
|
inline |
Fills the data associated with the accessor using the scalar value.
Special case of copy from host to device where the origin is a scalar value that will be replicated across the range of the accessor.
- Template Parameters
-
TAcc | Underlying type of the data associated with the accessor
T | Underlying type of the host scalar
dims | Number of dimensions of the accessor
accessMode | Access mode of the accessor
accessTarget | Access target of the accessor
isPlaceholder | Whether the accessor is a placeholder
COMPUTECPP_ENABLE_IF | The function is only valid when the access mode includes write access
- Parameters
-
acc | Accessor with the data that will be filled
val | Scalar used to fill the device data with
◆ fill() [2/4]
|
inline |
◆ fill() [3/4]
|
protected |
Fills the range of the accessor with the pattern pointed to by patternData.
- Parameters
-
acc | Accessor with the data that will be filled
patternData | Host data used to fill the device data with
patternSize | Size of the host data
◆ fill() [4/4]
|
protected |
Fills the memory pointed to by ptr.
- Parameters
-
ptr | Pointer object to fill.
patternData | Pointer to the memory that contains the pattern to use when filling ptr.
patternSize | The size in bytes of the pattern.
size | The number of bytes of ptr to fill with patternData.
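The byte-level meaning of these parameters can be sketched on the host with `std::memcpy`. This is an illustrative model only, not the ComputeCpp implementation: `fill_bytes_model` is a hypothetical name, and truncating a trailing partial pattern when size is not a multiple of patternSize is an assumption made here, not documented behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Host-side model: repeats the patternSize-byte pattern over the first
// `size` bytes of ptr.
// Assumption: if size is not a multiple of patternSize, the final copy is
// truncated so that no byte beyond ptr + size is written.
inline void fill_bytes_model(void* ptr, const void* patternData,
                             std::size_t patternSize, std::size_t size) {
  if (patternSize == 0) return;  // nothing sensible to replicate
  unsigned char* out = static_cast<unsigned char*>(ptr);
  for (std::size_t offset = 0; offset < size; offset += patternSize) {
    const std::size_t remaining = size - offset;
    const std::size_t chunk = remaining < patternSize ? remaining : patternSize;
    std::memcpy(out + offset, patternData, chunk);
  }
}
```

For example, filling a four-element `int` array with an `int` pattern passes `sizeof(int)` as patternSize and the size of the whole array as size, which sets every element to the pattern value.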
◆ get_context()
|
protected |
Returns the current context for the command group.
◆ get_device_weak()
|
protected |
Returns the current device for the command group.
◆ get_kernel_bundle_program()
|
protected |
Get the program to use from a kernel bundle.
has_kernel_bundle() == true is a precondition.
- Returns
- A program.
◆ get_optimal_workgroup_size()
|
protected |
Gets the optimal workgroup size for the current device and the given kernel.
Internal implementation.
- Parameters
-
syclKernel | Kernel for which the workgroup size is computed.
◆ has_kernel_bundle()
|
protected |
Returns true if use_kernel_bundle(bundle) has been called with the handler.
◆ interop_task_impl()
|
protected |
Schedules a host task with an interop_handle object.
- Parameters
-
hostTaskCallable Callable object that will be executed
◆ memcpy()
void cl::sycl::handler::memcpy | ( | void * | dest, |
const void * | src, | ||
size_t | size | ||
) |
Copies size bytes from src to dest.
- Parameters
-
dest | Pointer to the memory location to copy to.
src | Pointer to the memory location to copy from.
size | The number of bytes to copy.
◆ parallel_for() [1/10]
|
inline |
parallel_for enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the number of local and global work items specified by ndRange.
- Template Parameters
-
dimensions | Number of dimensions of the kernel
- Parameters
-
ndRange | Dimensions of the global and local work groups
syclKernel | The precompiled kernel to be enqueued
◆ parallel_for() [2/10]
|
inline |
parallel_for enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the number of global work items specified by range.
- Template Parameters
-
dimensions | Number of dimensions of the kernel
- Parameters
-
range | Dimensions of the global work group
syclKernel | The precompiled kernel to be enqueued
◆ parallel_for() [3/10]
|
inline |
parallel_for enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the number of global work items specified by range.
- Template Parameters
-
dimensions | Number of dimensions of the kernel
- Parameters
-
range | Dimensions of the global work group
offset | The offset into the data being executed
syclKernel | The precompiled kernel to be enqueued
◆ parallel_for() [4/10]
|
inline |
parallel_for enqueues the kernel functor to be executed as a number of instances working in parallel over the number of local and global work items specified by ndRange.
- Template Parameters
-
nameT | The name of the kernel being enqueued
functorT | The type of the kernel; deduced automatically by the compiler
dimensions | Number of dimensions of the kernel
- Parameters
-
ndRange | Dimensions of the global and local work groups
functor | The kernel being enqueued
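How an nd-range maps onto individual kernel instances can be pictured with a sequential, one-dimensional host model. This is a sketch for intuition only: `nd_item_model` and `parallel_for_nd_model` are hypothetical names, the real instances run in parallel on the device, and the check that the local size divides the global size is an assumption of this model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the index information given to each instance.
struct nd_item_model {
  std::size_t global_id;  // position within the whole range
  std::size_t local_id;   // position within the work group
  std::size_t group_id;   // index of the work group
};

// Calls functor once per work item, group by group.
// Assumption: localSize must be non-zero and divide globalSize evenly;
// otherwise nothing is run and false is returned.
template <typename Functor>
bool parallel_for_nd_model(std::size_t globalSize, std::size_t localSize,
                           const Functor& functor) {
  if (localSize == 0 || globalSize % localSize != 0) return false;
  const std::size_t numGroups = globalSize / localSize;
  for (std::size_t group = 0; group < numGroups; ++group) {
    for (std::size_t local = 0; local < localSize; ++local) {
      functor(nd_item_model{group * localSize + local, local, group});
    }
  }
  return true;
}
```

With a global size of 6 and a local size of 2, the model produces three groups of two items each, and the item with global id 3 has local id 1 in group 1.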
◆ parallel_for() [5/10]
|
inline |
parallel_for enqueues the kernel functor to be executed as a number of instances working in parallel over the number of local and global work items specified by ndRange.
- Template Parameters
-
nameT | The name of the kernel being enqueued
functorT | The type of the kernel; deduced automatically by the compiler
dimensions | Number of dimensions of the kernel
- Parameters
-
syclKernel | The precompiled kernel to be enqueued
ndRange | Dimensions of the global and local work groups
functor | The kernel being enqueued
◆ parallel_for() [6/10]
|
inline |
parallel_for enqueues the kernel functor to be executed as a number of instances working in parallel over the number of global work items specified by range.
- Template Parameters
-
nameT | The name of the kernel being enqueued
functorT | The type of the kernel; deduced automatically by the compiler
dimensions | Number of dimensions of the kernel
- Parameters
-
range | Dimensions of the global work group
functor | The kernel being enqueued
◆ parallel_for() [7/10]
|
inline |
parallel_for enqueues the kernel functor to be executed as a number of instances working in parallel over the number of global work items specified by range.
- Template Parameters
-
nameT | The name of the kernel being enqueued
functorT | The type of the kernel; deduced automatically by the compiler
- Parameters
-
range | Size of the global work group
functor | The kernel being enqueued
◆ parallel_for() [8/10]
|
inline |
parallel_for enqueues the kernel functor to be executed as a number of instances working in parallel over the number of global work items specified by range.
- Template Parameters
-
nameT | The name of the kernel being enqueued
functorT | The type of the kernel; deduced automatically by the compiler
dimensions | Number of dimensions of the kernel
- Parameters
-
range | Dimensions of the global work group
offset | The offset into the data being executed
functor | The kernel being enqueued
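The role of the offset parameter can be sketched with a sequential, one-dimensional host model. It assumes that each instance receives an index equal to its position in the range plus the offset; `parallel_for_offset_model` is a hypothetical name and the loop only stands in for instances that the device would run in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Calls functor once per work item in [0, range), passing the shifted index.
// Assumption: the index seen by each instance is its position in the range
// plus the offset.
template <typename Functor>
void parallel_for_offset_model(std::size_t range, std::size_t offset,
                               const Functor& functor) {
  for (std::size_t i = 0; i < range; ++i) {
    functor(offset + i);
  }
}
```

Under this assumption, a range of 3 with an offset of 10 invokes the functor with the indices 10, 11 and 12, so a kernel can process a sub-region of a larger buffer without adjusting its own indexing.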
◆ parallel_for() [9/10]
|
inline |
parallel_for enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the number of global work items specified by range.
- Template Parameters
-
nameT | The name of the kernel being enqueued
functorT | The type of the kernel; deduced automatically by the compiler
dimensions | Number of dimensions of the kernel
- Parameters
-
syclKernel | The precompiled kernel which is being run
range | Dimensions of the global work group
functor | The kernel being enqueued
◆ parallel_for() [10/10]
|
inline |
parallel_for enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the number of global work items specified by range.
- Template Parameters
-
nameT | The name of the kernel being enqueued
functorT | The type of the kernel; deduced automatically by the compiler
dimensions | Number of dimensions of the kernel
- Parameters
-
syclKernel | The precompiled kernel which is being run
range | Dimensions of the global work group
offset | The offset into the data being executed
functor | The kernel being enqueued
◆ parallel_for_work_group() [1/5]
|
inline |
parallel_for_work_group enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the local and global work items specified by numGroups.
- Template Parameters
-
dimensions Number of dimensions of the kernel
- Parameters
-
numGroups Dimensions of the global and local work groups
syclKernel The precompiled kernel which is being run
◆ parallel_for_work_group() [2/5]
|
inline |
parallel_for_work_group enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the local and global work items specified by range.
- Template Parameters
-
nameT The name of the kernel being enqueued
functorT This is the type of the kernel. It will be automatically deduced by the compiler
dimensions Number of dimensions of the kernel
- Parameters
-
syclKernel The precompiled kernel which is being run
range Dimensions of the global work groups
functor The kernel being enqueued
◆ parallel_for_work_group() [3/5]
|
inline |
parallel_for_work_group enqueues the kernel functor to be executed as a number of instances working in parallel over the local and global work items specified by range.
- Template Parameters
-
nameT The name of the kernel being enqueued
functorT This is the type of the kernel. It will be automatically deduced by the compiler
dimensions Number of dimensions of the kernel
- Parameters
-
range Dimensions of the global and local work groups
functor The kernel being enqueued
◆ parallel_for_work_group() [4/5]
|
inline |
parallel_for_work_group enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the local and global work items specified by numGroups and groupSize.
- Template Parameters
-
nameT The name of the kernel being enqueued
functorT This is the type of the kernel. It will be automatically deduced by the compiler
dimensions Number of dimensions of the kernel
- Parameters
-
syclKernel The precompiled kernel which is being run
numGroups Dimensions of the work groups being launched
groupSize Dimensions of each work group, in work-items
functor The kernel being enqueued
◆ parallel_for_work_group() [5/5]
|
inline |
parallel_for_work_group enqueues the precompiled kernel syclKernel to be executed as a number of instances working in parallel over the local and global work items specified by numGroups and groupSize.
- Template Parameters
-
nameT The name of the kernel being enqueued
functorT This is the type of the kernel. It will be automatically deduced by the compiler
dimensions Number of dimensions of the kernel
- Parameters
-
syclKernel The precompiled kernel which is being run
numGroups Dimensions of the work groups being launched
groupSize Dimensions of each work group, in work-items
functor The kernel being enqueued
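How numGroups and groupSize combine into an index space can be modelled on the host. The sketch below does not call the SYCL API: `WorkItem` and `enumerate_work_items` are hypothetical names, the model is one-dimensional, and it assumes the usual mapping of global id = group id * groupSize + local id.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record of one work item's position in the index space.
struct WorkItem {
  std::size_t group;   // index of the work group
  std::size_t local;   // index of the work item inside its group
  std::size_t global;  // derived global index
};

// Host-only model (not part of cl::sycl::handler): enumerates every work item
// launched for `numGroups` groups of `groupSize` work items each.
std::vector<WorkItem> enumerate_work_items(std::size_t numGroups,
                                           std::size_t groupSize) {
  std::vector<WorkItem> items;
  items.reserve(numGroups * groupSize);
  for (std::size_t g = 0; g < numGroups; ++g) {
    for (std::size_t l = 0; l < groupSize; ++l) {
      items.push_back({g, l, g * groupSize + l});
    }
  }
  return items;
}
```

In this model, 3 groups of 4 work items give 12 work items in total, and the item with global index 5 is local item 1 of group 1.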
◆ prefetch()
|
inline |
Hints to the SYCL runtime that the data should be made available on the device earlier than the USM model would otherwise require.
Can only be overlapped with kernel execution when Concurrent or System USM is available.
- Parameters
-
ptr Pointer to the memory to be prefetched to the device
size Number of bytes requested to be prefetched
◆ process_functor_arguments_impl()
|
protected |
Gets the parameters from a functor and sets them as OpenCL arguments.
Internal implementation.
- Parameters
-
syclKernel Kernel to which the functor is associated
functorBuffer Functor buffer cast down to a binary array
SI Kernel struct iterator
SE Kernel struct iterator
OI Kernel struct iterator
◆ register_for_dma() [1/3]
|
inline |
◆ register_for_dma() [2/3]
|
inline |
◆ register_for_dma() [3/3]
|
protected |
Registers an accessor for DMA transfer.
- Parameters
-
acc Accessor to use in a DMA transfer
stride DMA transfer stride, in bytes
◆ require() [1/3]
◆ require() [2/3]
|
protected |
Internal function that registers a placeholder accessor with the handler.
- Parameters
-
acc Placeholder accessor
◆ require() [3/3]
|
protected |
Internal function that registers a placeholder accessor with the handler.
- Parameters
-
acc Placeholder accessor
- Deprecated:
- Bind the null accessor first, then call require()
◆ set_arg()
|
inline |
◆ set_args()
|
inline |
Sets all the given kernel arguments for an OpenCL kernel, as if set_arg() were called with each of them in the same order, with the index starting at 0 and increasing by one per argument.
- Template Parameters
-
Ts Types of the parameters passed to the OpenCL kernel
- Parameters
-
args Parameters passed to the OpenCL kernel
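The relationship between set_args() and set_arg() can be sketched with a small mock. `MockHandler`, its `calls()` accessor and `record_set_args_demo` are hypothetical and exist only for illustration; this is not how the library implements the function, only one way a variadic call can be expanded into indexed calls.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a handler: it records each set_arg() call as
// "index=value" instead of binding an argument to an OpenCL kernel.
class MockHandler {
 public:
  template <typename T>
  void set_arg(int index, T&& value) {
    std::ostringstream os;
    os << index << '=' << value;
    calls_.push_back(os.str());
  }

  // Forwards each argument to set_arg() with indices 0, 1, 2, ...
  template <typename... Ts>
  void set_args(Ts&&... args) {
    int index = 0;
    // A braced initializer list is evaluated left to right, so the
    // arguments are visited in the order they were passed.
    int expand[] = {0, (set_arg(index++, std::forward<Ts>(args)), 0)...};
    (void)expand;
  }

  const std::vector<std::string>& calls() const { return calls_; }

 private:
  std::vector<std::string> calls_;
};

// Returns the calls recorded for set_args(7, 2.5, "buf").
std::vector<std::string> record_set_args_demo() {
  MockHandler handler;
  handler.set_args(7, 2.5, "buf");
  return handler.calls();
}
```

Here `record_set_args_demo()` records `0=7`, `1=2.5` and `2=buf`, matching three explicit set_arg() calls with indices 0, 1 and 2.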
◆ single_task() [1/3]
void cl::sycl::handler::single_task(kernel syclKernel)
Launches a single thread that executes the kernel serially, asynchronously with respect to host execution.
This function takes a precompiled kernel syclKernel previously created using build_with_kernel_type or compile_with_kernel_type.
- Parameters
-
syclKernel The precompiled kernel to be enqueued
◆ single_task() [2/3]
|
inline |
Launches a single thread that executes the kernel serially, asynchronously with respect to host execution.
- Template Parameters
-
nameT The name of the kernel being enqueued
functorT This is the type of the kernel. It will be automatically deduced by the compiler
- Parameters
-
functor The kernel being enqueued
◆ single_task() [3/3]
|
inline |
Launches a single thread that executes the kernel serially, asynchronously with respect to host execution.
This function takes a precompiled kernel syclKernel previously created using build_with_kernel_type or compile_with_kernel_type.
- Template Parameters
-
nameT The name of the kernel being enqueued
functorT This is the type of the kernel. It will be automatically deduced by the compiler
- Parameters
-
syclKernel The precompiled kernel to be enqueued
functor The kernel being enqueued
◆ update_device_data()
|
protected |
Updates device data by copying to/from the device.
- Parameters
-
acc Accessor that is used to access the buffer or image
hostPtr Pointer that points to data on the host
accessMode Operation indicator (read -> copy_from_device, write -> copy_to_device)
userProvidedPtr Indicates whether the host pointer was provided by the user or whether to use the internal one
◆ update_host()
|
inline |
Update the memory object accessed by a given accessor on the host.
The host copy of the memory object will be up to date once this has executed.
- Template Parameters
-
T The accessor element type.
dims The accessor's dimensionality.
accessMode The access mode of the accessor.
accessTarget The access target.
isPlaceholder Whether the accessor is a placeholder.
- Parameters
-
acc An accessor for the memory object to update.
- Note
- This does not yet automatically update the host pointer given to the initial buffer. For this to be updated, construct the buffer using the property property::buffer::use_host_ptr.
◆ update_host_impl()
|
protected |
Implementation for update_host.
- Parameters
-
acc An accessor for the memory object to update.
- Note
- This does not yet automatically update the host pointer given to the initial buffer. For this to be updated, construct the buffer using the property property::buffer::use_host_ptr.
◆ use_kernel_bundle_impl()
|
protected |
This command group will use device images from the given kernel bundle when invoking kernels.
- Parameters
-
execBundle The bundle to use kernels from.
Member Data Documentation
◆ acc
◆ bufObj
◆ m_fallbackQueue
|
protected |
◆ m_numKernels
|
protected |
◆ m_paramVec
|
protected |
◆ m_queue
|
protected |
◆ m_trans
|
protected |
The documentation for this class was generated from the following file: