/* ************************************************************************
* Copyright (C) 2018-2022 Advanced Micro Devices, Inc. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell cop-
* ies of the Software, and to permit persons to whom the Software is furnished
* to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in all
* copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IM-
* PLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
* FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
* COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
* IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNE-
* CTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* ************************************************************************ */
#pragma once
#include "rocblas.h"
#include <cmath>
#include <cstdint>
#include <hip/hip_runtime.h>
#include <type_traits>
/* ============================================================================================ */
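// Helper function to truncate float to bfloat16.
// Keeps the upper 16 bits of the IEEE-754 binary32 representation (sign,
// 8 exponent bits, top 7 mantissa bits) and discards the rest without rounding.
inline __host__ rocblas_bfloat16 float_to_bfloat16_truncate(float val)
{
    union
    {
        float    fp32;
        uint32_t int32;
    } u = {val};
    rocblas_bfloat16 ret;
    ret.data = uint16_t(u.int32 >> 16);
    // A NaN whose payload lies entirely in the low 16 bits would truncate to
    // infinity; set the lowest mantissa bit so the result stays a NaN.
    if((u.int32 & 0x7fff0000) == 0x7f800000 && (u.int32 & 0xffff))
        ret.data |= 1; // Preserve signaling NaN
    return ret;
}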
/* ============================================================================================ */
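/*! \brief negate a value */
template <class T>
inline T negate(T x)
{
    return -x;
}

// Specialization for rocblas_half: flip the sign bit (bit 15) of the 16-bit pattern
template <>
inline rocblas_half negate(rocblas_half arg)
{
    union
    {
        rocblas_half fp;
        uint16_t     data;
    } x = {arg};
    x.data ^= 0x8000;
    return x.fp;
}

// Specialization for rocblas_bfloat16: flip the sign bit (bit 15) of the stored data
template <>
inline rocblas_bfloat16 negate(rocblas_bfloat16 x)
{
    x.data ^= 0x8000;
    return x;
}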
/* ============================================================================================ */
// Conjugate a value. For most types, simply return argument; for
// rocblas_float_complex and rocblas_double_complex, return std::conj(z)
template <typename T, std::enable_if_t<!rocblas_is_complex<T>, int> = 0>
__host__ inline T conjugate(const T& z)
{
    return z;
}

template <typename T, std::enable_if_t<rocblas_is_complex<T>, int> = 0>
__host__ inline T conjugate(const T& z)
{
    return std::conj(z);
}