.. Copyright (c) Serge Guelton and Johan Mabille
   Copyright (c) QuantStack
   Distributed under the terms of the BSD 3-Clause License.
   The full license is in the file LICENSE, distributed with this software.
Changelog
=========
13.2.0
------
* Added broadcast overload for bool
* Fixed kernel::store for booleans
* Explicitly verify dependency between architectures (like sse3 implies sse2)
* Use default arch alignment as default alignment for xsimd::aligned_allocator
* sse2 version of xsimd::swizzle on [u]int16_t
* avx implementation of transpose for [u]int[8|16]
* Implement [u]int8 and [u]int16 matrix transpose for 128 bit registers
* Fix minor warning
* Fix fma4 support
13.1.0
------
* Fix rotate_left and rotate_right behavior (they were swapped)
* Fix compress implementation on RISC-V
* Improve RISC-V CI
* Fix clang-17 compilation on RISC-V
* Validate cmake integration
* Provide xsimd::transpose on 64 and 32 bits on most platforms
* Improve documentation
* Provide xsimd::batch_bool::count
* Fix interaction between xsimd::make_sized_batch_t and
  xsimd::batch<std::complex, ...>
* Fix vbmi, sve and rvv detection through xsimd::available_architectures
* Fix compilation on MS targets where ``small`` can be defined.
* Change default install directory for installed headers.
* Support mixed-complex implementations of xsimd::pow()
* Improve xsimd::pow implementation for complex numbers
* Fix uninitialized read in lgamma implementation
13.0.0
------
* Most xsimd functions are flagged as always_inline
* Fix some xsimd scalar versions (abs, bitofsign, signbit, bitwise_cast, exp10)
* Move from batch_constant<batch<T, A>, Csts...> to batch_constant<T, A, Csts...>
* Move from batch_bool_constant<batch<T, A>, Csts...> to batch_bool_constant<T, A, Csts...>
* Provide an as_batch() (resp. as_batch_bool()) method for batch_constant (resp. batch_bool_constant)
* New architecture emulated<N> for batches of N bits emulated using scalar operations.
* Remove the version method from all architectures
* Support xsimd::avg and xsimd::avgr vector operations
* Model i8mm arm extension
* Fix dispatching mechanism
12.1.1
------
* Update readme with a section on adoption, and a section on the history of the project
* Fix avx512vnni implementation
* Fix regression on XSIMD_NO_SUPPORTED_ARCHITECTURE
12.1.0
------
* Fix various problems with architecture version handling
* Specialize xsimd::compress for riscv
* Provide stubs for various avx512xx architectures
12.0.0
------
* Fix sincos implementation to cope with Emscripten
* Upgraded minimal version of cmake to remove deprecation warning
* Fixed constants::signmask for GCC when using ffast-math
* Add RISC-V Vector support
* Generic, simple implementation for xsimd::compress
* Disable batch of bools, and suggest using batch_bool instead
* Add an option to skip installation
11.2.0
------
* Provide shuffle operations of floating point batches
* Provide a generic implementation of xsimd::swizzle with dynamic indices
* Implement rotl, rotr, rotate_left and rotate_right
* Let CMake figure out pkgconfig directories
* Add missing boolean operators in xsimd_api.hpp
* Initial implementation for the new WASM-based instruction set
* Provide a generic version for float to uint32_t conversion
11.1.0
------
* Introduce XSIMD_DEFAULT_ARCH to force default architecture (if any)
* Remove C++ requirement on xsimd::exp10 scalar implementation
* Improve and test documentation
11.0.0
------
* Provide a generic reducer
* Fix ``find_package(xsimd)`` for xtl enabled xsimd, reloaded
* Cleanup benchmark code
* Provide avx512f implementation of FMA and variants
* Hexadecimal floating points are not a C++11 feature
* Back to slow implementation of exp10 on Windows
* Changed bitwise_cast API
* Provide generic signed/unsigned type conversion
* Fixed sde location
* Add incr and decr operations
* Cleanup documentation
10.0.0
------
* Fix potential ABI issue in SVE support
* Disable fast exp10 on OSX
* Assert on unaligned memory when calling aligned load/store
* Fix warning about uninitialized storage
* Always forward arch parameter
* Do not specialize the behavior of ``simd_return_type`` for char
* Support broadcasting of complex batches
* Make xsimd compatible with -fno-exceptions
* Provide and test comparison operators overloads that accept scalars
9.0.1
-----
* Fix potential ABI issue in SVE support, making ``xsimd::sve`` a type alias to
  a size-dependent type.
9.0.0
-----
* Support fixed size SVE
* Fix a bug in SSSE3 ``xsimd::swizzle`` implementation for ``int8`` and ``int16``
* Rename ``xsimd::hadd`` into ``xsimd::reduce_add``, provide ``xsimd::reduce_min`` and ``xsimd::reduce_max``
* Properly report unsupported double for neon on arm32
* Fill holes in xsimd scalar api
* Fix ``find_package(xsimd)`` for xtl enabled xsimd
* Replace ``xsimd::bool_cast`` by ``xsimd::batch_bool_cast``
* Native ``xsimd::hadd`` for float on arm64
* Properly static_assert when trying to instantiate an ``xsimd::batch`` of xtl complex
* Introduce ``xsimd::batch_bool::mask()`` and ``batch_bool::from_mask(...)``
* Flag some functions with ``[[nodiscard]]``
* Accept both relative and absolute libdir and include dir in xsimd.pc
* Implement ``xsimd::nearbyint_as_int`` for NEON
* Add ``xsimd::polar``
* Speed up double -> F32/I32 gathers
* Add ``xsimd::slide_left`` and ``xsimd::slide_right``
* Support integral ``xsimd::swizzle`` on AVX
8.1.0
-----
* Add ``xsimd::gather`` and ``xsimd::scatter``
* Add ``xsimd::nearbyint_as_int``
* Add ``xsimd::none``
* Add ``xsimd::reciprocal``
* Remove batch constructor from memory address, use ``xsimd::batch<...>::load_(un)aligned`` instead
* Let MSVC users manually disable FMA3 on AVX
* Provide ``xsimd::insert`` to modify a single value from a vector
* Make ``xsimd::pow`` implementation resilient to ``FE_INVALID``
* Reciprocal square root support through ``xsimd::rsqrt``
* NEON: Improve ``xsimd::any`` and ``xsimd::all``
* Provide type utility to explicitly require a batch of given size and type
* Implement ``xsimd::swizzle`` on x86, neon and neon64
* AVX support for ``xsimd::zip_lo`` and ``xsimd::zip_hi``
* Only use ``_mm256_unpacklo_epi<N>`` on AVX2
* Provide neon/neon64 conversion function from ``uint(32|64)_t`` to ``(float|double)``
* Provide SSE/AVX/AVX2 conversion function from ``uint32_t`` to ``float``
* Provide AVX2 conversion function from ``(u)int64_t`` to ``double``
* Provide better SSE conversion function from ``uint64_t`` to ``double``
* Provide better SSE conversion function to ``double``
* Support logical xor for ``xsimd::batch_bool``
* Clarify fma support:
  - FMA3 + SSE -> ``xsimd::fma3<sse4_2>``
  - FMA3 + AVX -> ``xsimd::fma3<avx>``
  - FMA3 + AVX2 -> ``xsimd::fma3<avx2>``
  - FMA4 -> ``xsimd::fma4``
* Allow ``xsimd::transform`` to work with complex types
* Add missing scalar version of ``xsimd::norm`` and ``xsimd::conj``
8.0.5
-----
* Fix neon ``xsimd::hadd`` implementation
* Detect unsupported architectures and set ``XSIMD_NO_SUPPORTED_ARCHITECTURE``
  if need be
8.0.4
-----
* Provide some conversion operators for ``float`` -> ``uint32``
* Improve code generated for AVX2 signed integer comparisons
* Enable detection of avx512cd and avx512dq, and fix avx512bw detection
* Enable detection of AVX2+FMA
* Pick the best compatible architecture in ``xsimd::dispatch``
* Enable support for FMA when AVX2 is detected on Windows
* Add missing includes / forward declarations
* Mark all functions inline and noexcept
* Assert when using incomplete ``std::initializer_list``
8.0.3
-----
* Improve CI & testing, no functional change
8.0.2
-----
* Do not use ``_mm256_srai_epi32`` under AVX, it's an AVX2 instruction
8.0.1
-----
* Fix invalid constexpr ``std::make_tuple`` usage in neon64