2025.0.0
Improvements
Add option -mllvm --amdgpu-oclc-unsafe-int-atomics=true to select between safe and unsafe atomics [34135a37]
Improved compiler diagnostic for missing architecture [fc9d62f]
Implemented sycl_ext_codeplay_enqueue_native_command extension [0f48227]
Implement handler::prefetch and handler::mem_advise as empty nodes enforcing the node dependencies in SYCL-Graph [unified-runtime 3c12bbc]
Remove some overheads from UR sync-points used to implement SYCL-Graph edges [unified-runtime 3c12bbc]
Bug Fixes
Fix SYCL built-ins when using generic address spaces [9e4768c, 51ffc04]
Fix shuffles when using half floating points [b13a3c4]
Fix multi_ptr relational operators for nullptr [4f91bbb]
Use hipMemcpyDefault rather than hipMemcpyHostToHost for hipMemcpyKind parameter in SYCL-Graph memcpy nodes (urCommandBufferAppendUSMMemcpyExp) [unified-runtime 3c12bbc]
2024.2.0
The AMD plugin is now out of beta.
Improvements
Add support for ROCm 5.7, 6.0 and 6.1 (remove support for ROCm 4.5)
Added experimental support for the sycl_ext_oneapi_graph extension (ROCm 5.5+) [897b2707]
Add memory advice support [a669374b]
Add support for the sycl_ext_oneapi_device_global extension [d377464c]
Add support for the sycl_ext_oneapi_peer_access extension [unified-runtime f39d41f7]
Bug Fixes
Fix shuffles for half floating point type [b13a3c4]
Fix math built-ins when using shared USM allocations [9e4768c]
Fix and improve local work size guessing [unified-runtime 43f0963]
Add missing architectures to SYCL target shortcut syntax [c1ce1594]
Fix USM 2D copies with newer ROCm versions [unified-runtime 532dac51]
2024.1.0
Improvements
Added support for half floating point types [89875490]
Finished implementation of SYCL subgroup operations [289aeaef, 3f3df772]
Added AMD Matrix cores support for CDNA2 architecture (gfx90a) through sycl_ext_oneapi_matrix [31481cea]
Added support for sycl_ext_oneapi_device_architecture [1ad69e59]
Added support for ext_oneapi_queue_priority [0c33fea5]
Improve relevance of returned error codes [66a24f7b, b7a43a42]
Implement some missing make_ interoperability entry points [5e9d07b1]
Switch to using primary HIP context [d1c92cb9]
Bug Fixes
Fix using -mllvm -enable-global-offset=false flag [00cf4c29]
Remove extra prefetch calls before kernel launches [unified-runtime 841a2870]
Work around atomic_sub bug on some systems [c8a6f0dd]
Fix race condition in event profiling [e8ffd021]
Deprecation
Deprecate context interoperability; the primary context should be used instead [e213fe2f]
2024.0.2
No changes
2024.0.1
No changes
2024.0
No changes
2023.2.0
Improvements
SYCL Compiler
Add gfx9+ HIP atomics [b13561c9]
Add basic HIP atomics [c3c5e923]
Disable macro __CUDA_ARCH__ for AMD HIP platform [8a7cf2b2]
SYCL Library
Added support for sycl_ext_oneapi_memcpy2d - OneAPI memcpy2d on AMD backend [9008a5d2]
Add support for PCI device ID and UUID [e09ff588]
Support the SYCL_PI_HIP_MAX_LOCAL_MEM_SIZE environment variable [92f6d688]
Support amd-gpu-gfx1034 as an acceptable value for -fsycl-targets [5e86a41d]
Bug Fixes
Replace error on invalid work group size with PI_ERROR_INVALID_WORK_GROUP_SIZE [2357af0a]
Address wrong results from sycl::ctz function [5a9f601e]
Address the issue that can cause events not to be waited on as intended [1b225447], [ce7c594f]
2023.1.0
Improvements
SYCL Compiler
Implement support for AMD architectures (such as amd_gpu_gfx1032) as argument to -fsycl-targets [e5de913f]
SYCL Library
Add cl_khr_fp64 in device extensions [cd832bff]
Support zero range kernel for hip backends [a3958865]
Bug Fixes
Fix incorrectly constructed guards [ce7c594f]
Add interop header and device specialization [998fd91e]
2023.0.0
Initial beta release of oneAPI for AMD GPUs!
This release was created from the intel/llvm repository at commit 0f579ba.
New Features
Beta support for HIP backend
SYCL Compiler
Support for device side assert
Support for local memory accessors
Support for group collective functions
Support for sycl::ext::oneapi::sub_group::get_local_id
SYCL Library
Support for querying atomic64 device capability
Support for multiple HIP streams per SYCL queue
Support for interoperability
Support for sycl::queue::submit_barrier