//------------------------------------------------------------------------------
// GraphBLAS/CUDA/GB_cuda_upscale_identity: return identity, >= 32 bits in size
//------------------------------------------------------------------------------
// SuiteSparse:GraphBLAS, Timothy A. Davis, (c) 2017-2025, All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0
//------------------------------------------------------------------------------
// CUDA atomics are not supported for 1-byte values, and are likely to be slow
// for 2-byte values.  This method initializes the identity value of a monoid,
// scaling up the 1-byte and 2-byte cases to 4 bytes.
#include "GB_cuda.hpp"

void GB_cuda_upscale_identity
(
    GB_void *identity_upscaled,     // output: at least sizeof (uint32_t)
    GrB_Monoid monoid               // input: monoid whose identity is upscaled
)
{

    //--------------------------------------------------------------------------
    // get the monoid and initialize its upscaled identity value
    //--------------------------------------------------------------------------

    GrB_BinaryOp op = GB_boolean_rename_op (monoid->op) ;
    size_t zsize = op->ztype->size ;

    // clear the output (at least 4 bytes), then copy in the original identity
    memset (identity_upscaled, 0, GB_IMAX (zsize, sizeof (uint32_t))) ;
    memcpy (identity_upscaled, monoid->identity, zsize) ;

    if (zsize >= sizeof (uint32_t))
    {
        // the identity is already at least 4 bytes: no more work to do
        return ;
    }
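
    //--------------------------------------------------------------------------
    // upscale the identity value
    //--------------------------------------------------------------------------

    // The byte copy above is not sufficient for all 1-byte and 2-byte types.
    // For example, copying INT8_MIN (0x80) into a zeroed 4-byte buffer gives
    // 128 on a little-endian host rather than -128.  SET converts the identity
    // to a 4-byte type, overwrites the output with it, and returns.

    GB_Type_code zcode = op->ztype->code ;
    GB_Opcode opcode = op->opcode ;

    #define SET(type,id)                                            \
    {                                                               \
        type id32 = (type) (id) ;                                   \
        memcpy (identity_upscaled, &id32, sizeof (uint32_t)) ;      \
        return ;                                                    \
    }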

    switch (opcode)
    {

        case GB_MIN_binop_code:

            switch (zcode)
            {
                case GB_INT8_code   : SET (int32_t,  INT8_MAX) ;
                case GB_INT16_code  : SET (int32_t,  INT16_MAX) ;
                case GB_UINT8_code  : SET (uint32_t, UINT8_MAX) ;
                case GB_UINT16_code : SET (uint32_t, UINT16_MAX) ;
                default: ;
            }
            break ;

        case GB_MAX_binop_code:

            switch (zcode)
            {
                case GB_INT8_code   : SET (int32_t,  INT8_MIN) ;
                case GB_INT16_code  : SET (int32_t,  INT16_MIN) ;
//              case GB_UINT8_code  : SET (uint32_t, 0) ;   done already
//              case GB_UINT16_code : SET (uint32_t, 0) ;   done already
                default: ;
            }
            break ;

        case GB_TIMES_binop_code:

            switch (zcode)
            {
                case GB_INT8_code   : SET (int32_t,  1) ;
                case GB_INT16_code  : SET (int32_t,  1) ;
                case GB_UINT8_code  : SET (uint32_t, 1) ;
                case GB_UINT16_code : SET (uint32_t, 1) ;
                default: ;
            }
            break ;

        case GB_LAND_binop_code : SET (uint32_t, true) ;
        case GB_EQ_binop_code   : SET (uint32_t, true) ;

        case GB_BAND_binop_code:
        case GB_BXNOR_binop_code:

            switch (zcode)
            {
                case GB_UINT8_code  : SET (uint32_t, 0xFF) ;
                case GB_UINT16_code : SET (uint32_t, 0xFFFF) ;
                default: ;
            }
            break ;

        case GB_LOR_binop_code  :
        case GB_LXOR_binop_code :
        case GB_PLUS_binop_code :
        case GB_ANY_binop_code  :
        case GB_BOR_binop_code  :
        case GB_BXOR_binop_code :
            // already zero
            break ;

        default : ;
    }
}