Source: golang-github-minio-blake2b-simd
Section: devel
Priority: optional
Maintainer: Debian Go Packaging Team <team+pkg-go@tracker.debian.org>
Uploaders: Andrius Merkys <merkys@debian.org>
Build-Depends: debhelper-compat (= 12),
               dh-golang,
               golang-any
Standards-Version: 4.5.0
Homepage: https://github.com/minio/blake2b-simd
Vcs-Browser: https://salsa.debian.org/go-team/packages/golang-github-minio-blake2b-simd
Vcs-Git: https://salsa.debian.org/go-team/packages/golang-github-minio-blake2b-simd.git
XS-Go-Import-Path: github.com/minio/blake2b-simd
Testsuite: autopkgtest-pkg-go

Package: golang-github-minio-blake2b-simd-dev
Architecture: all
Depends: ${misc:Depends}
Description: Fast hashing using pure Go implementation of BLAKE2b with SIMD instructions
 BLAKE2b-SIMD is a pure Go implementation of BLAKE2b using SIMD
 optimizations.
 .
 This package was initially based on the pure Go BLAKE2b
 (https://github.com/dchest/blake2b) implementation of Dmitry Chestnykh
 and merged with the (cgo dependent) AVX optimized BLAKE2
 (https://github.com/codahale/blake2) implementation, which in turn is
 based on the official implementation (https://github.com/BLAKE2/BLAKE2).
 It does so using Go's assembler (https://golang.org/doc/asm) on amd64
 architectures, with a Go-only fallback for other architectures.
 .
 In addition to AVX there is also support for AVX2 as well as SSE. Best
 performance is obtained with AVX2, which gives roughly a 4x performance
 increase, approaching hashing speeds of 1 GB/sec on a single core.
 .
 BLAKE2b is a hashing algorithm that operates on 64-bit integer values. The
 AVX2 version uses the 256-bit wide YMM registers to process four
 operations in parallel. AVX and SSE operate on 128-bit values (two
 operations in parallel). Below are excerpts from compressAvx2_amd64.s,
 compressAvx_amd64.s, and compress_generic.go respectively.
 .
  VPADDQ YMM0,YMM0,YMM1 /* v0 += v4, v1 += v5, v2 += v6, v3 += v7 */
 .
  VPADDQ XMM0,XMM0,XMM2 /* v0 += v4, v1 += v5 */
  VPADDQ XMM1,XMM1,XMM3 /* v2 += v6, v3 += v7 */
 .
  v0 += v4
  v1 += v5
  v2 += v6
  v3 += v7