Aksel Alpay authored
[PCUDA] Initial support for AdaptiveCpp portable CUDA: A HIP/CUDA dialect for the generic JIT compiler (#1737)

* [SSCP] Store number of original kernel parameters in HCF
* Fix free kernel generation
* Start implementing runtime
* Avoid getAccessType for compatibility
* Finish wiring up kernel call
* Some support for host SLM
* Get kernel launch to work. More or less.
* Make cuda chevron launch optional feature
* Add missing files
* Fix pcudaFree API
* Improve compatibility of memory management API
* [PCUDA] Avoid duplicates in HCF format for host side parameter sizes
* Fix aggregate type handling for PCUDA kernel arguments
* [PCUDA] Add documentation
* [PCUDA] Add some LIT tests
* [PCUDA] Improve handling of switching between platforms/backends
* [PCUDA] Implement more runtime API functionality
* [PCUDA] Implement event and StreamWaitEvent API
* Fix CI issues
* Enable PCUDA lit tests
* Fix chevron-launch LIT test
* [PCUDA] Add slm/dlm support
* Default to assume kernel dim=3 in case the hipsycl_kernel_dim annotation could not be found
* [SSCP][OpenMP] Consistently handle internal local memory subsections for globals
* [SSCP][llvm-to-host][PCUDA] Unfold constant expressions when replacing local mem GVs with builtin call
* [PCUDA] Update docs with slm/dlm support
* [PCUDA] Avoid llvm::SmallDenseMap::contains() for compatibility
* [PCUDA] Remove trailing backslash in __global__ attribute
* Avoid race condition in prefer-group-size extension test
* [OpenMP] Capture dag node in submit_kernel to avoid lifetime issue
* [PCUDA] ExternDynamicLocalMemoryPass: Fix AS cast for LLVM < 17
* [CI] Enable PCUDA LIT tests for OpenMP
* [PCUDA] Fix free kernel argument handling for multi-parameter kernels
* [PCUDA][llvm-to-host] Fix ConstExpr unfolding for static local memory support
* [PCUDA] Add SLM LIT test case
* [PCUDA] Add DLM LIT test
* [PCUDA][NFC] Minor doc updates
* [PCUDA] Also perform constant expr unfolding for ExternDynamicLocalMemoryPass
* [PCUDA][llvm-to-spirv] Rewrite GEPs to zero-sized arrays to avoid llvm-spirv limitation
* Add some diagnostic output
* Remove diagnostics again
* [PCUDA] Add test cases for runtime API
* Add pcudaSetDeviceExt()
* [PCUDA][CI] Enable PCUDA API tests
* [PCUDA] Use GNU attribute syntax instead of [[clang::...]] for compatibility
* [PCUDA] Add pcudaGetDeviceProperties
* [PCUDA] Add pcudaGetErrorName/String
* [PCUDA] Add support for vector types
* [PCUDA] acpp: Improve handling of .cu files
* [PCUDA] C math functions to account for CUDA-flavored math stdlib behavior
* [PCUDA] Also provide cuda.h header
* [PCUDA] SLM: Look through ConstantExpr arguments when scanning for local memory annotation
* [PCUDA] Add pcudaThreadSynchronize()
* Fix missing synchronization in in-order queue test case
* [NFC][doc] Fix formatting
* Add PCUDA to mkdocs
* [PCUDA] Add pcudaHostAlloc
* [PCUDA] StreamDestroy: Fix typo in compatibility header
* [PCUDA] Add DriverGetVersion, PeekAtLastError, FreeHost
* [PCUDA] Fix copy-paste error in mapping of __pcudaGridDim
* [PCUDA] Fix bug in kernel launch and chevron LIT test case
* [PCUDA] Fix incorrect type in vector type definitions
* [PCUDA] Add Memset, MemsetAsync
* [NFC][doc] Update PCUDA documentation