.. image:: https://github.com/Maratyszcza/PeachPy/blob/master/logo/peachpy.png
  :alt: PeachPy logo
  :align: center

===========================================================================
Portable Efficient Assembly Code-generator in Higher-level Python (PeachPy)
===========================================================================

.. image:: https://img.shields.io/badge/License-BSD%202--Clause%20%22Simplified%22%20License-blue.svg
  :alt: PeachPy License: Simplified BSD
  :target: https://github.com/Maratyszcza/PeachPy/blob/master/LICENSE.rst
.. image:: https://travis-ci.org/Maratyszcza/PeachPy.svg?branch=master
  :alt: Travis-CI Build Status
  :target: https://travis-ci.org/Maratyszcza/PeachPy/
.. image:: https://ci.appveyor.com/api/projects/status/p64ew9in189bu2pl?svg=true
  :alt: AppVeyor Build Status
  :target: https://ci.appveyor.com/project/MaratDukhan/peachpy

PeachPy is a Python framework for writing high-performance assembly kernels.
PeachPy aims to simplify writing optimized assembly kernels while preserving all optimization opportunities of traditional assembly. Some PeachPy features:

- Universal assembly syntax for Windows, Unix, and Golang assembly.

  * PeachPy can directly generate ELF, MS COFF and Mach-O object files and assembly listings for the Golang toolchain

- Automatic adaptation of functions to different calling conventions and ABIs.

  * Functions for different platforms can be generated from the same assembly source
  * Supports Microsoft x64 ABI, System V x86-64 ABI (Linux, OS X, and FreeBSD), Linux x32 ABI, Native Client x86-64 SFI ABI, Golang AMD64 ABI, Golang AMD64p32 ABI

- Automatic register allocation.

  * PeachPy is flexible and lets you mix auto-allocated and hardcoded registers in the same code.

- Automation of routine tasks in assembly programming:

  * Function prolog and epilog are generated by PeachPy
  * De-duplication of data constants (e.g. ``Constant.float32x4(1.0)``)
  * Analysis of ISA extensions used in a function

- Supports x86-64 instructions up to AVX-512 and SHA

  * Including 3dnow!+, XOP, FMA3, FMA4, TBM and BMI2.
  * Excluding x87 FPU and most system instructions.
  * Rigorously tested with `auto-generated tests <https://github.com/Maratyszcza/PeachPy/tree/master/test/x86_64/encoding>`_ to produce the same opcodes as binutils.

- Auto-generation of metadata files

  * Makefile with module dependencies (``-MMD`` and ``-MF`` options)
  * C header for the generated functions
  * Function metadata in JSON format

- Python-based metaprogramming and code-generation.
- Multiplexing of multiple instruction streams (helpful for software pipelining).
- Compatible with Python 2 and Python 3, CPython and PyPy.

Online Demo
-----------
You can try the online demo at `PeachPy.IO <http://www.peachpy.io>`_.
Installation
------------
PeachPy is under active development, and there are presently no stable releases of the 0.2 branch. We recommend that you use the ``master`` version:

.. code-block:: bash

    pip install --upgrade git+https://github.com/Maratyszcza/PeachPy

Installation for development
****************************
If you plan to modify PeachPy, we recommend the following installation procedure:

.. code-block:: bash

    git clone https://github.com/Maratyszcza/PeachPy.git
    cd PeachPy
    python setup.py develop

Using PeachPy as a command-line tool
------------------------------------

.. code-block:: python

    # These two lines are not needed for PeachPy, but will help you get autocompletion in good code editors
    from peachpy import *
    from peachpy.x86_64 import *

    # Let's write a function float DotProduct(const float* x, const float* y)

    # If you want maximum cross-platform compatibility, arguments must have names
    x = Argument(ptr(const_float_), name="x")
    # If name is not specified, it is auto-detected
    y = Argument(ptr(const_float_))

    # Everything inside the `with` statement is the function body
    with Function("DotProduct", (x, y), float_,
        # Enable instructions up to SSE4.2
        # PeachPy will report an error if you accidentally use a newer instruction
        target=uarch.default + isa.sse4_2):

        # Request two 64-bit general-purpose registers. No need to specify exact names.
        reg_x, reg_y = GeneralPurposeRegister64(), GeneralPurposeRegister64()

        # This is a cross-platform way to load arguments. PeachPy will map it to something proper later.
        LOAD.ARGUMENT(reg_x, x)
        LOAD.ARGUMENT(reg_y, y)

        # Also request a virtual 128-bit SIMD register...
        xmm_x = XMMRegister()
        # ...and fill it with data
        MOVAPS(xmm_x, [reg_x])
        # It is fine to mix virtual and physical (xmm0-xmm15) registers in the same code
        MOVAPS(xmm2, [reg_y])

        # Execute dot product instruction, put result into xmm_x
        DPPS(xmm_x, xmm2, 0xF1)

        # This is a cross-platform way to return results. PeachPy will take care of ABI specifics.
        RETURN(xmm_x)

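
The immediate ``0xF1`` passed to ``DPPS`` packs two 4-bit masks: the high nibble selects which lane products enter the sum, and the low nibble selects which destination lanes receive it. The pure-Python model below is only an illustrative sketch of that behaviour. The helper name ``dpps_model`` is hypothetical and not part of PeachPy, and because it uses Python floats it ignores single-precision rounding:

.. code-block:: python

    def dpps_model(a, b, imm8):
        """Model DPPS on two 4-element lists of floats (hypothetical helper)."""
        assert len(a) == 4 and len(b) == 4
        # Bits 7:4 choose which of the four lane products contribute to the sum
        total = 0.0
        for lane in range(4):
            if imm8 & (0x10 << lane):
                total += a[lane] * b[lane]
        # Bits 3:0 choose which destination lanes receive the sum; the rest are zeroed
        return [total if imm8 & (1 << lane) else 0.0 for lane in range(4)]

    result = dpps_model([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], 0xF1)
    print(result)  # [70.0, 0.0, 0.0, 0.0]

With ``0xF1`` all four products are summed and the sum lands in the lowest lane, which is where a scalar ``float`` lives in an XMM register.
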
Now you can compile this code into a binary object file that you can link into a program...

.. code-block:: bash

    # Use MS-COFF format with Microsoft ABI for Windows
    python -m peachpy.x86_64 -mabi=ms -mimage-format=ms-coff -o example.obj example.py
    # Use Mach-O format with SysV ABI for OS X
    python -m peachpy.x86_64 -mabi=sysv -mimage-format=mach-o -o example.o example.py
    # Use ELF format with SysV ABI for Linux x86-64
    python -m peachpy.x86_64 -mabi=sysv -mimage-format=elf -o example.o example.py
    # Use ELF format with x32 ABI for Linux x32 (x86-64 with 32-bit pointer)
    python -m peachpy.x86_64 -mabi=x32 -mimage-format=elf -o example.o example.py
    # Use ELF format with Native Client x86-64 ABI for Chromium x86-64
    python -m peachpy.x86_64 -mabi=nacl -mimage-format=elf -o example.o example.py

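
When several targets are built from the same source, these invocations can be scripted. The sketch below only assembles the argument lists, using the ``-mabi``, ``-mimage-format`` and ``-o`` flags shown above. The ``TARGETS`` table and the ``peachpy_command`` helper are hypothetical names introduced for illustration, and the distinct output file names are an assumption chosen so that the outputs do not overwrite each other:

.. code-block:: python

    import sys

    # (ABI, image format, output file) triples based on the commands above;
    # the per-target output names are an assumption made for this sketch.
    TARGETS = [
        ("ms", "ms-coff", "example-windows.obj"),
        ("sysv", "mach-o", "example-macos.o"),
        ("sysv", "elf", "example-linux.o"),
    ]

    def peachpy_command(abi, image_format, output, source="example.py"):
        """Return the argv list for one PeachPy command-line invocation."""
        return [
            sys.executable, "-m", "peachpy.x86_64",
            "-mabi=" + abi,
            "-mimage-format=" + image_format,
            "-o", output,
            source,
        ]

    for abi, image_format, output in TARGETS:
        print(" ".join(peachpy_command(abi, image_format, output)))

Each list can be handed to ``subprocess.check_call`` to run the build.
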
What else? You can convert the program to Plan 9 assembly for use with the Go programming language:

.. code-block:: bash

    # Use Go ABI (asm version) with -S flag to generate assembly for Go x86-64 targets
    python -m peachpy.x86_64 -mabi=goasm -S -o example_amd64.s example.py
    # Use Go-p32 ABI (asm version) with -S flag to generate assembly for Go x86-64 targets with 32-bit pointers
    python -m peachpy.x86_64 -mabi=goasm-p32 -S -o example_amd64p32.s example.py

If Plan 9 assembly is too restrictive for your use-case, generate ``.syso`` objects `which can be linked into Go programs <https://github.com/golang/go/wiki/GcToolchainTricks#use-syso-file-to-embed-arbitrary-self-contained-c-code>`_:

.. code-block:: bash

    # Use Go ABI (syso version) to generate .syso objects for Go x86-64 targets
    # Image format can be any (ELF/Mach-O/MS-COFF)
    python -m peachpy.x86_64 -mabi=gosyso -mimage-format=elf -o example_amd64.syso example.py
    # Use Go-p32 ABI (syso version) to generate .syso objects for Go x86-64 targets with 32-bit pointers
    # Image format can be any (ELF/Mach-O/MS-COFF)
    python -m peachpy.x86_64 -mabi=gosyso-p32 -mimage-format=elf -o example_amd64p32.syso example.py

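
The output names in the Go commands above follow a pattern: the architecture suffix (``_amd64`` or ``_amd64p32``) matches the Go target, and the extension depends on whether assembly or a ``.syso`` object is produced. A small sketch of that mapping follows; ``go_output_name`` is a hypothetical helper written for this README and is not part of PeachPy:

.. code-block:: python

    def go_output_name(base, abi):
        """Map a PeachPy Go ABI name to the output file name pattern used above."""
        # The -p32 ABI variants target Go x86-64 with 32-bit pointers
        arch = "amd64p32" if abi.endswith("-p32") else "amd64"
        # goasm* ABIs produce Plan 9 assembly, gosyso* ABIs produce object files
        extension = ".s" if abi.startswith("goasm") else ".syso"
        return "{}_{}{}".format(base, arch, extension)

    for abi in ("goasm", "goasm-p32", "gosyso", "gosyso-p32"):
        print(abi, "->", go_output_name("example", abi))
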
See `examples <https://github.com/Maratyszcza/PeachPy/tree/master/examples>`_ for real-world scenarios of using PeachPy with ``make``, ``nmake`` and ``go generate`` tools.
Using PeachPy as a Python module
--------------------------------
When the command-line tool does not provide sufficient flexibility, Python scripts can import PeachPy objects from the ``peachpy`` and ``peachpy.x86_64`` modules and arbitrarily manipulate output images, program structure, instructions, and bytecodes.

PeachPy as Inline Assembler for Python
**************************************
PeachPy links assembly and Python: it represents assembly instructions and syntax as Python classes, functions, and objects.
But it also works the other way around: PeachPy can represent your assembly functions as callable Python functions!

.. code-block:: python

    from peachpy import *
    from peachpy.x86_64 import *

    x = Argument(int32_t)
    y = Argument(int32_t)

    with Function("Add", (x, y), int32_t) as asm_function:
        reg_x = GeneralPurposeRegister32()
        reg_y = GeneralPurposeRegister32()

        LOAD.ARGUMENT(reg_x, x)
        LOAD.ARGUMENT(reg_y, y)

        ADD(reg_x, reg_y)

        RETURN(reg_x)

    python_function = asm_function.finalize(abi.detect()).encode().load()

    print(python_function(2, 2)) # -> prints "4"

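
A loaded function can be checked against a pure-Python reference like any other callable. The sketch below models 32-bit two's-complement addition, which is what ``ADD`` on 32-bit registers computes. ``reference_add`` and ``check_against_reference`` are hypothetical helpers, and how the loaded function handles Python integers outside the ``int32_t`` range is not covered here:

.. code-block:: python

    def reference_add(a, b):
        """Add two integers with 32-bit two's-complement wraparound."""
        total = (a + b) & 0xFFFFFFFF
        # Reinterpret the low 32 bits as a signed int32_t value
        return total - 0x100000000 if total & 0x80000000 else total

    def check_against_reference(function, reference, cases):
        """Return the argument tuples on which the two callables disagree."""
        return [args for args in cases if function(*args) != reference(*args)]

    cases = [(2, 2), (-1, 1), (2**31 - 1, 1)]
    # Pass python_function from the example above as the first argument.
    # The reference is compared with itself here so the sketch runs without PeachPy.
    print(check_against_reference(reference_add, reference_add, cases))  # []
    print(reference_add(2**31 - 1, 1))  # -2147483648
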
PeachPy as Instruction Encoder
******************************
PeachPy can be used to explore instruction length, opcodes, and alternative encodings:

.. code-block:: python

    from peachpy.x86_64 import *

    ADD(eax, 5).encode()
    # -> bytearray(b'\x83\xc0\x05')

    MOVAPS(xmm0, xmm1).encode_options()
    # -> [bytearray(b'\x0f(\xc1'), bytearray(b'\x0f)\xc8')]

    VPSLLVD(ymm0, ymm1, [rsi + 8]).encode_length_options()
    # -> {6: bytearray(b'\xc4\xe2uGF\x08'),
    #     7: bytearray(b'\xc4\xe2uGD&\x08'),
    #     9: bytearray(b'\xc4\xe2uG\x86\x08\x00\x00\x00')}

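
The register-to-register encodings above can be read by splitting the ModRM byte (``0xC0`` in the ``ADD`` example, the last byte of each ``MOVAPS`` option) into its three bit fields. The helper below is a hypothetical illustration written for this README, not a PeachPy API:

.. code-block:: python

    def split_modrm(byte):
        """Split a ModRM byte into its (mod, reg, rm) bit fields."""
        return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111

    # ADD(eax, 5): opcode 0x83, ModRM 0xC0, 8-bit immediate 0x05
    print(split_modrm(0xC0))  # (3, 0, 0)
    # The two MOVAPS(xmm0, xmm1) options use opcodes 0x0F 0x28 and 0x0F 0x29
    print(split_modrm(0xC1))  # (3, 0, 1)
    print(split_modrm(0xC8))  # (3, 1, 0)

A ``mod`` value of 3 marks a register operand. For ``ADD`` with opcode ``0x83`` the ``reg`` field is an opcode extension, while the two ``MOVAPS`` forms swap which field holds ``xmm0`` and which holds ``xmm1``, which is why both encodings describe the same move.
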
Tutorials
---------
- `Writing Go assembly functions with PeachPy <https://blog.gopheracademy.com/advent-2016/peachpy/>`_ by `Damian Gryski <https://github.com/dgryski>`_
- `Adventures in JIT compilation (Part 4) <http://eli.thegreenplace.net/2017/adventures-in-jit-compilation-part-4-in-python/>`_ by `Eli Bendersky <https://github.com/eliben>`_
Users
-----
- `NNPACK <https://github.com/Maratyszcza/NNPACK>`_ -- an acceleration layer for convolutional networks on multi-core CPUs.
- `ChaCha20 <https://git.schwanenlied.me/yawning/chacha20>`_ -- Go implementation of ChaCha20 cryptographic cipher.
- `AEZ <https://git.schwanenlied.me/yawning/aez>`_ -- Go implementation of AEZ authenticated-encryption scheme.
- `bp128 <https://github.com/robskie/bp128>`_ -- Go implementation of SIMD-BP128 integer encoding and decoding.
- `go-marvin32 <https://github.com/dgryski/go-marvin32>`_ -- Go implementation of Microsoft's Marvin32 hash function.
- `go-highway <https://github.com/dgryski/go-highway>`_ -- Go implementation of Google's Highway hash function.
- `go-metro <https://github.com/dgryski/go-metro>`_ -- Go implementation of MetroHash function.
- `go-stadtx <https://github.com/dgryski/go-stadtx>`_ -- Go implementation of Stadtx hash function.
- `go-sip13 <https://github.com/dgryski/go-sip13>`_ -- Go implementation of SipHash 1-3 function.
- `go-chaskey <https://github.com/dgryski/go-chaskey>`_ -- Go implementation of Chaskey MAC.
- `go-speck <https://github.com/dgryski/go-speck>`_ -- Go implementation of SPECK cipher.
- `go-bloomindex <https://github.com/dgryski/go-bloomindex>`_ -- Go implementation of Bloom-filter based search index.
- `go-groupvarint <https://github.com/dgryski/go-groupvarint>`_ -- SSE-optimized group varint integer encoding in Go.
- `Yeppp! <http://www.yeppp.info>`_ performance library. All optimized kernels in Yeppp! are implemented in PeachPy (uses old version of PeachPy with deprecated syntax).
Peer-Reviewed Publications
--------------------------
- Marat Dukhan "PeachPy: A Python Framework for Developing High-Performance Assembly Kernels", Python for High-Performance Computing (PyHPC) 2013 (`slides <http://www.yeppp.info/resources/peachpy-slides.pdf>`_, `paper <http://www.yeppp.info/resources/peachpy-paper.pdf>`_, code uses deprecated syntax)
- Marat Dukhan "PeachPy meets Opcodes: Direct Machine Code Generation from Python", Python for High-Performance Computing (PyHPC) 2015 (`slides <http://www.peachpy.io/slides/pyhpc2015>`_, `paper on ACM Digital Library <https://dl.acm.org/citation.cfm?id=2835860>`_).
Other Presentations
-------------------
- Marat Dukhan "Developing Low-Level Assembly Kernels in PeachPy", presentation at `The First BLIS Retreat Workshop <https://www.cs.utexas.edu/users/flame/BLISRetreat/>`_, 2013 (`slides <https://www.cs.utexas.edu/users/flame/BLISRetreat/BLISRetreatTalks/PeachPy.pdf>`_, code uses deprecated syntax)
- Marat Dukhan "Porting BLIS micro-kernels to PeachPy", presentation at `The Third BLIS Retreat Workshop <https://www.cs.utexas.edu/users/flame/BLISRetreat2015/>`_, 2015 (`slides <http://www.peachpy.io/slides/blis-retreat-2015/>`_)
- Marat Dukhan "Accelerating Data Processing in Go with SIMD Instructions", presentation at the `Atlanta Go Meetup <http://www.meetup.com/Go-Users-Group-Atlanta>`_, September 16, 2015 (`slides <https://docs.google.com/presentation/d/1MYg8PyhEf0oIvZ9YU2panNkVXsKt5UQBl_vGEaCeB1k/edit?usp=sharing>`_)
Dependencies
------------
- Nearly all instruction classes in PeachPy are generated from `Opcodes Database <https://github.com/Maratyszcza/Opcodes>`_
- Instruction encodings in PeachPy are validated against `binutils <https://www.gnu.org/software/binutils/>`_ using auto-generated tests
- PeachPy uses `six <https://pythonhosted.org/six/>`_ and `enum34 <https://pypi.python.org/pypi/enum34>`_ packages as a compatibility layer between Python 2 and Python 3
Acknowledgements
----------------
.. image:: https://github.com/Maratyszcza/PeachPy/blob/master/logo/hpcgarage.png
  :alt: HPC Garage logo
  :target: http://hpcgarage.org/

.. image:: https://github.com/Maratyszcza/PeachPy/blob/master/logo/college-of-computing.gif
  :alt: Georgia Tech College of Computing logo
  :target: http://www.cse.gatech.edu/

This work is a research project at the HPC Garage lab in the Georgia Institute of Technology, College of Computing, School of Computational Science and Engineering.
The work was supported in part by grants to Prof. Richard Vuduc's research lab, `The HPC Garage <http://www.hpcgarage.org>`_, from the National Science Foundation (NSF) under NSF CAREER award number 0953100; and a grant from the Defense Advanced Research Projects Agency (DARPA) Computer Science Study Group program.

Any opinions, conclusions or recommendations expressed in this software and documentation are those of the authors and do not necessarily reflect those of NSF or DARPA.