1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77
|
TODO (Mar 2024):
GB_CUDA_FLAGS: get from GB_config.h, allow user to modify them, etc
and pass to nvcc. Also GB_CUDA_COMPILER (default to nvcc)
complex data types
set/get cuda archictures
CUDA PreJIT kernels
GB_cuda_matrix_advise: write it
dot3: allow iso
use a stream pool (from RMM)
can rmm_wrap be thread safe?
set/get which GPU(s) to use
handling nvcc compiler errors
handle all CUDA errors
Tim:
reduce calls GB_enumify_reduce twice
--------------------------------------------------------------------------------
all the FIXMEs
clean up comments and code style
hexadecimal
stream pool
test complex
reduce: do any monoid
terminal condition?
ANY monoid in mxm
full test suite
when to use the GPU? which GPU? See the new GxB_Context object
rmm_init
--------------------------------------------------------------------------------
future:
cuda: needs source directory
(1) environment var set. If so, use it.
(3) not found so no cuda jit
all of GrB_mxm?
(1) DOT
dot2: C<#> = A'*B C is bitmap or full
dot3: C<M>=A'B C is sparse empty, M is sparse/hyper
dot4: C += A'B C is full (+ same as semiring monoid)
(2) colscale C = A*D
(3) rowscale C = D*B
(4) SAXPY
saxpy3 C<#> = A*B C is sparse or hyper
saxpy4 C += A*B C is full, A hyper or sparse
B is full or bitmap
saxpy5 C += A*B C is full, B hyper or sparse
A is full or bitmap
bitmap C<#> = A*B C bitmap
NO bitmap C<#> += A*B C bitmap
NO outer C<#> = AB' C sparse? full? bitmap?
GrB_select?
|