llvm-py
Python Bindings for LLVM

llvm-py provides Python bindings for LLVM. This document explains how you can set up and use it. A working knowledge of Python and a basic idea of LLVM are assumed.

Introduction

LLVM (Low-Level Virtual Machine) provides enough infrastructure to use it as the backend for your compiled, or JIT-compiled language. It provides extensive optimization support, and static and dynamic (JIT) backends for many platforms. See the website at http://www.llvm.org/ to discover more.

Python bindings for LLVM provide a gentler learning curve for working with the LLVM APIs. It should also be easier to create working prototypes and experimental languages using this medium.

Together with clang or llvm-gcc it also provides a means to quickly instrument C and C++ sources. For example, llvm-gcc can be used to generate the LLVM assembly for a given C source file, which can then be loaded and manipulated (adding profiling code to every function, say) using an llvm-py-based Python script.

License

Both LLVM and llvm-py are distributed under (different) permissive open source licenses. llvm-py uses the new BSD license. More information is available here.

Platforms

llvm-py has been built, tested, or reported to work on various GNU/Linux flavours, *BSD, and Mac OS X, on i386 and amd64 architectures. Windows is not supported, for a variety of reasons.

Versions

llvm-py 0.6 requires version 2.7 of LLVM. It will not work with previous versions.

llvm-py has been built and tested with Python 2.6. It should work with Python 2.4 and 2.5. It has not been tried with Python 3.x (patches welcome).

Installation

llvm-py is distributed as a source tarball. You’ll need to build and install it before it can be used. At least the following will be required for this:

  • C and C++ compilers (gcc/g++)

  • Python itself

  • Python development files (headers and libraries)

  • LLVM, either installed or built

On Debian-based systems, the first three can be installed with the command sudo apt-get install gcc g++ python python-dev. Ensure that your distro’s repository has the appropriate version of LLVM!

It does not matter which compiler LLVM itself was built with (g++, llvm-g++ or any other); llvm-py can be built with any compiler. It has been tried only with gcc/g++ though.

LLVM and --enable-pic

The result of an LLVM build is a set of static libraries and object files. llvm-py contains an extension package that is built into a shared object (_core.so), which links to these static libraries and object files. It is therefore required that the LLVM libraries and object files be built with the -fPIC option (generate position independent code). Be sure to use the --enable-pic option while configuring LLVM (default is no PIC), like this:

~/llvm$ ./configure --enable-pic --enable-optimized

llvm-config

In order to build llvm-py, its build script needs to know from where it can invoke the LLVM helper program, llvm-config. If you’ve installed LLVM, then this will be available in your PATH, and nothing further needs to be done. If you’ve built LLVM yourself, or for any reason llvm-config is not in your PATH, you’ll need to pass the full path of llvm-config to the build script.

You’ll need to be root to install llvm-py system-wide (that is, without the --user option used in the steps below). Remember that root’s PATH is different from yours, so even if llvm-config is in your PATH, it may not be available when you use sudo.
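Whether llvm-config is visible on a given PATH can be checked from Python before running the build. The following is only a sketch using the standard library (shutil.which needs Python 3.3 or later, which is newer than the Python versions this guide was tested with); the helper name find_llvm_config is hypothetical and is not part of llvm-py or its setup.py.

```python
import shutil


def find_llvm_config(search_path=None):
    """Return the full path of llvm-config, or None if it is not found.

    search_path is a PATH-style string; None means "use the current PATH".
    (Hypothetical helper, for illustration only.)
    """
    return shutil.which("llvm-config", path=search_path)


found = find_llvm_config()
if found is None:
    print("llvm-config is not on PATH; pass --llvm-config=<full path> to setup.py")
else:
    print("llvm-config found at", found)
```

Running the same check under sudo shows what root’s PATH can see.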

Steps

The commands illustrated below assume that the LLVM source is available under /home/mdevan/llvm. If you’ve a previous version of llvm-py installed, it is recommended to remove it first, as described below.

If you have llvm-config in your path, you can build and install llvm-py this way:

$ tar jxvf llvm-py-0.6.tar.bz2
$ cd llvm-py-0.6
$ python setup.py install --user

If you need to tell the build script where llvm-config is, do it this way:

$ tar jxvf llvm-py-0.6.tar.bz2
$ cd llvm-py-0.6
$ python setup.py install --user --llvm-config=/home/mdevan/llvm/Release/bin/llvm-config

To build a debug version of llvm-py, that links against the debug libraries of LLVM, use this:

$ tar jxvf llvm-py-0.6.tar.bz2
$ cd llvm-py-0.6
$ python setup.py build -g --llvm-config=/home/mdevan/llvm/Debug/bin/llvm-config
$ python setup.py install --user --llvm-config=/home/mdevan/llvm/Debug/bin/llvm-config

Be warned that debug binaries will be huge (100MB+)! They are required only if you need to debug into LLVM as well.

setup.py is a standard Python distutils script. See the Python documentation regarding Installing Python Modules and Distributing Python Modules for more information on such scripts.

Uninstall

If you’d installed llvm-py with the --user option, then llvm-py would be present under ~/.local/lib/python2.6/site-packages. Otherwise, it might be under /usr/lib/python2.6/site-packages or /usr/local/lib/python2.6/site-packages. The directory would vary with your Python version and OS flavour. Look around.

Once you’ve located the site-packages directory, the modules and the "egg" can be removed like so:

$ rm -rf <site-packages>/llvm <site-packages>/llvm_py-0.6-py2.6.egg-info

See the Python documentation for more information.
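Rather than looking around by hand, the candidate site-packages directories can be printed by the interpreter itself. This is a minimal sketch that assumes Python 2.7 or later (site.getusersitepackages and the sysconfig module are not available in Python 2.6); it only reports directories and removes nothing.

```python
import site
import sysconfig

# Per-user site-packages, the target of "setup.py install --user".
user_site = site.getusersitepackages()

# System-wide directory for packages that contain extension modules
# (llvm-py ships one, _core.so) for the interpreter running this script.
system_site = sysconfig.get_paths()["platlib"]

print("user site-packages:  ", user_site)
print("system site-packages:", system_site)
```

Run it with the same Python interpreter that was used to install llvm-py, since the directories vary with the Python version.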

LLVM Concepts

This section explains a few concepts related to LLVM, not specific to llvm-py.

Intermediate Representation

The intermediate representation, or IR for short, is an in-memory data structure that represents executable code. The IR data structures allow for creation of types, constants, functions, function arguments, instructions, global variables and so on. For example, to create a function sum that takes two integers and returns their sum, we need to follow these steps:

  • create an integer type ti of required bitwidth

  • create a function type tf which takes two ti-s and returns another ti

  • create a function of type tf named sum

  • add a basic block to the function

  • using a helper object called an instruction builder, add two instructions into the basic block:

    1. an instruction to add the two arguments and store the result into a temporary variable

    2. a return instruction to return the value of the temporary variable

(A basic block is a block of instructions.)

LLVM has its own instruction set; the instructions used above (add and ret) are from this set. The LLVM instructions are at a higher level than the usual assembly language; for example, there are instructions related to variable argument handling, exception handling, and garbage collection. These allow high-level languages to be represented cleanly in the IR.

SSA Form and PHI Nodes

All LLVM instructions are represented in the Static Single Assignment (SSA) form. Essentially, this means that any variable can be assigned to only once. Such a representation facilitates better optimization, among other benefits.

A consequence of single assignment is the need for PHI (Φ) nodes. These are required when a variable can be assigned a different value based on the path of control flow. For example, the value of b at the end of execution of the snippet below:

a = 1;
if (v < 10)
  a = 2;
b = a;

cannot be determined statically. The value of 2 cannot be assigned to the original a, since a can be assigned to only once. There are two a's in there, and the last assignment has to choose which version to pick. This is accomplished by adding a PHI node:

a1 = 1;
if (v < 10)
  a2 = 2;
b = PHI(a1, a2);

The PHI node selects a1 or a2, depending on the block from which control reached the PHI node. The argument a1 of the PHI node is associated with the block "a1 = 1;" and a2 with the block "a2 = 2;".

PHI nodes have to be explicitly created in the LLVM IR. Accordingly the LLVM instruction set has an instruction called phi.
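The selection rule of a PHI node can be imitated in plain Python. The sketch below does not use llvm-py or generate any IR; the block labels "entry" and "then" and the helper names phi and run are made up for illustration. It only shows that the value chosen depends on which block control arrived from.

```python
def phi(came_from, incoming):
    """Pick the value associated with the block control arrived from.

    incoming maps a predecessor block label to the value defined there.
    """
    return incoming[came_from]


def run(v):
    # block "entry": a1 = 1
    a1 = 1
    came_from = "entry"
    a2 = None  # stays undefined unless the "then" block executes
    if v < 10:
        # block "then": a2 = 2
        a2 = 2
        came_from = "then"
    # join block: b = PHI(a1, a2)
    return phi(came_from, {"entry": a1, "then": a2})


print(run(5))   # control reaches the join from "then"
print(run(50))  # control reaches the join from "entry"
```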

LLVM Assembly Language

The LLVM IR can be represented offline in two formats:

  • a textual, human-readable form, similar to assembly language, called the LLVM assembly language (files with .ll extension)

  • a binary form, called the LLVM bitcode (files with .bc extension)

All three formats (the in-memory IR, the LLVM assembly language and the LLVM bitcode) represent the same information. Each format can be converted into the other two formats (using LLVM APIs).

The LLVM demo page lets you type in C or C++ code, converts it into LLVM IR and outputs the IR as LLVM assembly language code.

Just to get a feel of the LLVM assembly language, here’s a function in C, and the corresponding LLVM assembly (as generated by the demo page):

/* compute sum of 1..n */
unsigned sum(unsigned n)
{
  if (n == 0)
    return 0;
  else
    return n + sum(n-1);
}

The corresponding LLVM assembly:

; ModuleID = '/tmp/webcompile/_7149_0.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-linux-gnu"

define i32 @sum(i32 %n) nounwind readnone {
entry:
  %0 = icmp eq i32 %n, 0                          ; <i1> [#uses=1]
  br i1 %0, label %bb2, label %bb1

bb1:                                              ; preds = %entry
  %1 = add i32 %n, -1                             ; <i32> [#uses=2]
  %2 = icmp eq i32 %1, 0                          ; <i1> [#uses=1]
  br i1 %2, label %sum.exit, label %bb1.i

bb1.i:                                            ; preds = %bb1
  %3 = add i32 %n, -2                             ; <i32> [#uses=1]
  %4 = tail call i32 @sum(i32 %3) nounwind        ; <i32> [#uses=1]
  %5 = add i32 %4, %1                             ; <i32> [#uses=1]
  br label %sum.exit

sum.exit:                                         ; preds = %bb1.i, %bb1
  %6 = phi i32 [ %5, %bb1.i ], [ 0, %bb1 ]        ; <i32> [#uses=1]
  %7 = add i32 %6, %n                             ; <i32> [#uses=1]
  ret i32 %7

bb2:                                              ; preds = %entry
  ret i32 0
}

Note the usage of SSA form. The long string called target datalayout is a specification of the platform ABI (like endianness, sizes of types, alignment etc.).

The LLVM Language Reference defines the LLVM assembly language including the entire instruction set.

Modules

Modules, in the LLVM IR, are similar to a single C language source file (.c file). A module contains:

  • functions (declarations and definitions)

  • global variables and constants

  • global type aliases (typedef-s)

Modules are top-level containers; all executable code representation is contained within modules. Modules may be combined (linked) together to give a bigger resultant module. During this process LLVM attempts to reconcile the references between the combined modules.

Optimization and Passes

LLVM provides quite a few optimization algorithms that work on the IR. These algorithms are organized as passes. Each pass does something specific, like combining redundant instructions. Passes need not always optimize the IR; they can also perform other operations like inserting instrumentation code, analysing the IR (the result of which can be used by passes that do optimizations) or even printing call graphs.

This LLVM documentation page describes all the available passes, and what they do.

LLVM never runs passes on its own. Passes have to be explicitly selected and run on each module. This gives you the flexibility to choose transformations and optimizations that are most suitable for the code in the module.

There is an LLVM binary called opt, which lets you run passes on bitcode files from the command line. You can write your own passes (in C/C++, as a shared library), which can then be loaded and executed by opt. (Although llvm-py does not allow you to write your own passes, it does allow you to navigate the entire IR at any stage, and perform any transforms on it as you like.)

A "pass manager" is responsible for loading passes, selecting the correct objects to run them on (for example, a pass may work only on functions, individually) and actually running them. opt is a command-line wrapper for the pass manager.
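The division of labour between passes and a pass manager can be pictured with a toy model in plain Python. Nothing here is LLVM or llvm-py API: the "module" is just a dictionary, and ToyPassManager, drop_nops and count_instructions are hypothetical names. The sketch only illustrates that passes are registered explicitly, that a function-level pass is applied to each function individually, and that a pass may analyse instead of transform.

```python
class ToyPassManager(object):
    """Toy model: keeps an ordered list of passes and runs them on a module."""

    def __init__(self):
        self.passes = []

    def add(self, scope, fn):
        # scope is "module" (run once) or "function" (run once per function)
        self.passes.append((scope, fn))

    def run(self, module):
        for scope, fn in self.passes:
            if scope == "function":
                for name in list(module["functions"]):
                    fn(module, name)
            else:
                fn(module)


def drop_nops(module, name):
    # transformation pass: rewrites the body of one function
    body = module["functions"][name]
    module["functions"][name] = [ins for ins in body if ins != "nop"]


def count_instructions(module):
    # analysis pass: records a result without changing any code
    module["size"] = sum(len(body) for body in module["functions"].values())


module = {"functions": {"f": ["add", "nop", "ret"], "g": ["nop", "ret"]}}
pm = ToyPassManager()
pm.add("function", drop_nops)
pm.add("module", count_instructions)
pm.run(module)
print(module)
```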

Bit code

TODO

Execution Engine, JIT and Interpreter

TODO

The llvm-py Package

llvm-py is a Python package, consisting of 6 modules, that wraps enough LLVM APIs to allow the implementation of your own compiler/VM backend in pure Python. If you’ve come this far, you probably know why this is a good idea.

Out of the 6 modules, one is an “extension” module (i.e., it is written in C), and another one is a small private utility module, which leaves 4 public modules. These are:

  • llvm — top-level package, common classes (like exceptions)

  • llvm.core — IR-related APIs

  • llvm.ee — execution engine related APIs

  • llvm.passes — pass manager and passes related APIs

The modules contain only classes and (integer) constants. Mostly simple Python constructs are used (deliberately) — property() and property decorators are probably the most exotic animals around. All classes are "new style" classes. The APIs are designed to be navigable (and guessable!) once you know a few conventions. These conventions are highlighted in the sections below.

Here is a quick overview of the contents of each package:

llvm
  • LLVMException — exception class (currently the only one)

llvm.core
  • Module — represents an LLVM Module

  • Type — represents an LLVM Type

  • IntegerType, FunctionType, StructType, ArrayType, PointerType, VectorType  — derived classes of Type

  • TypeHandle — used for constructing recursive (self-referencing) types (e.g. linked list nodes)

  • Value — represents an LLVM Value

  • Constant, GlobalValue, GlobalVariable, Argument, Function, Instruction, CallOrInvokeInstruction, PHINode, SwitchInstruction —  various derived classes of Value

  • BasicBlock — another derived class of Value, represents an LLVM basic block

  • Builder — used for creating instructions, wraps LLVM IRBuilder helper class

  • ModuleProvider — required to use modules in execution engine and pass manager

  • constants TYPE_* that represent various types

  • constants CC_* that represent calling conventions

  • constants ICMP_* and FCMP_* that represent integer and real comparison predicates (like less than, greater than etc.)

  • constants LINKAGE_* that represent linkage of symbols (external, internal etc.)

  • constants VISIBILITY_* that represent visibility of symbols (default, hidden, protected)

  • constants ATTR_* that represent function parameter attributes

llvm.ee
  • ExecutionEngine — represents an execution engine (which can be either an interpreter or a JIT)

  • TargetData — represents the ABI of the target platform (details like sizes and alignment of primitive types, endianness etc.)

llvm.passes
  • PassManager — represents an LLVM pass manager

  • FunctionPassManager — represents an LLVM function pass manager

  • constants PASS_* that represent various passes

A note on importing these modules

Pythonically, modules are imported with the statement "import llvm.core". However, you might find it more convenient to import llvm-py modules thus:

from llvm import *
from llvm.core import *
from llvm.ee import *
from llvm.passes import *

This saves quite a bit of typing. Both conventions work, however.

Tip

Python-style documentation strings (__doc__) are present in llvm-py. You can use the help() of the interactive Python interpreter or the object? of IPython to get online help. (Note: not complete yet!)

Module (llvm.core)

Modules are top-level container objects. You need to create a module object first, before you can add global variables, aliases or functions. Modules are created using the static method Module.new:

#!/usr/bin/env python

from llvm import *
from llvm.core import *

# create a module
my_module = Module.new('my_module')

The constructor of the Module class should not be used to instantiate a Module object. This is a common feature for all llvm-py classes.

Tip
Convention

All llvm-py objects are instantiated using static methods of corresponding classes. Constructors should not be used.

The argument my_module is a module identifier (a plain string). A module can also be constructed via deserialization from a bit code file, using the static method from_bitcode. This method takes a file-like object as argument, i.e., it should have a read() method that returns the entire data in a single call, as is the case with the builtin file object. Here is an example:

# create a module from a bit code file
bcfile = open("test.bc", "rb")   # bitcode is binary data
my_module = Module.from_bitcode(bcfile)

There is also a corresponding serialization method, called to_bitcode:

# write out a bit code file from the module
bcfile = open("test.bc", "wb")   # bitcode is binary data
my_module.to_bitcode(bcfile)

Modules can also be constructed from LLVM assembly files (.ll files). The static method from_assembly can be used for this. Similar to the from_bitcode method, this one also takes a file-like object as argument:

# create a module from an assembly file
llfile = open("test.ll")
my_module = Module.from_assembly(llfile)

Modules can be converted into their assembly representation by stringifying them (see below).
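Since from_bitcode and from_assembly only ask for an object whose read() returns the whole content in one call, in-memory buffers can stand in for real files. The sketch below does not call llvm-py at all; looks_like_bitcode is a hypothetical helper that reads a file-like object the same way and checks for the four-byte magic number that starts a raw LLVM bitcode file ('B', 'C', 0xC0, 0xDE).

```python
import io

# First four bytes of a raw LLVM bitcode file.
BITCODE_MAGIC = b"BC\xc0\xde"


def looks_like_bitcode(fileobj):
    """Read everything from fileobj in one call and check the magic number."""
    data = fileobj.read()
    return data[:4] == BITCODE_MAGIC


# Any object with a suitable read() works, not only real files.
fake_bc = io.BytesIO(BITCODE_MAGIC + b"\x00" * 12)
fake_ll = io.BytesIO(b"; ModuleID = 'my_module'\n")

print(looks_like_bitcode(fake_bc))
print(looks_like_bitcode(fake_ll))
```

A check like this can help decide whether a given file should go to from_bitcode or from_assembly.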

llvm.core.Module
Static Constructors
new(module_id)

Create a new Module instance with given module_id. The module_id should be a string.

from_bitcode(fileobj)

Create a new Module instance by deserializing the bitcode file represented by the file-like object fileobj.

from_assembly(fileobj)

Create a new Module instance by parsing the LLVM assembly file represented by the file-like object fileobj.

Properties
data_layout

A string representing the ABI of the platform.

target

A string like i386-pc-linux-gnu or i386-pc-solaris2.8.

pointer_size [read-only]

The size in bits of pointers, of the target platform. A value of zero represents llvm::Module::AnyPointerSize.

global_variables [read-only]

An iterable that yields GlobalVariable objects, that represent the global variables of the module.

functions [read-only]

An iterable that yields Function objects, that represent functions in the module.

Methods
get_type_named(name)

Return a Type object for the given alias name (typedef).

add_type_name(name, ty)

Add an alias (typedef) for the type ty with the name name.

delete_type_name(name)

Delete an alias with the name name.

add_global_variable(ty, name)

Add a global variable of the type ty with the name name. Returns a GlobalVariable object.

get_global_variable_named(name)

Get a GlobalVariable object corresponding to the global variable with the name name. Raises LLVMException if such a variable does not exist.

add_function(ty, name)

Add a function named name with the function type ty. ty must be an object of type FunctionType.

get_function_named(name)

Get a Function object corresponding to the function with the name name. Raises LLVMException if such a function does not exist.

get_or_insert_function(ty, name)

Like get_function_named, but adds the function first, if not present (like add_function).

verify()

Verify the correctness of the module. Raises LLVMException on errors.

to_bitcode(fileobj)

Write the bitcode representation of the module to the file-like object fileobj.

link_in(other)

Link in another module other into this module. Global variables, functions etc. are matched and resolved. The other module is no longer valid and should not be used after this operation. This API might be replaced with a full-fledged Linker class in the future.

Special Methods
__str__

Module objects can be stringified into their LLVM assembly language representation.

__eq__

Module objects can be compared for equality. Internally, this converts both arguments into their LLVM assembly representations and compares the resultant strings.

Tip
Convention

All llvm-py objects (where it makes sense), when stringified, return the LLVM assembly representation. For example, print module_obj prints the LLVM assembly form of the entire module.

Such objects, when compared for equality, internally compare these string representations.
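One practical effect of this convention is that two distinct Python objects compare equal whenever their assembly text is the same. The toy class below imitates the convention in plain Python; AsmBacked is a hypothetical stand-in and not an llvm-py class.

```python
class AsmBacked(object):
    """Toy stand-in for an object that wraps a piece of LLVM IR."""

    def __init__(self, asm):
        self._asm = asm

    def __str__(self):
        # stringifying yields the assembly text
        return self._asm

    def __eq__(self, other):
        # equality is decided by comparing the two assembly strings
        return str(self) == str(other)

    def __ne__(self, other):
        return not self == other


a = AsmBacked("i32")
b = AsmBacked("i32")
c = AsmBacked("i64")

print(a == b)  # same text, so equal
print(a is b)  # but still two separate objects
print(a == c)  # different text, so not equal
```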

Types (llvm.core)

Types are what you think they are. An instance of llvm.core.Type, or one of its derived classes, represents a type. llvm-py does not use as many classes to represent types as does LLVM itself. Some types are represented using llvm.core.Type itself and the rest are represented using derived classes of llvm.core.Type. As usual, an instance is created via one of the static methods of Type. These methods return an instance of either llvm.core.Type itself or one of its derived classes.

The following table lists all the available types along with the static method which has to be used to construct it and the name of the class whose object is actually returned by the static method.

Name                               Constructor Method           Class
integer of bitwidth n              Type.int(n)                  IntegerType
32-bit float                       Type.float()                 Type
64-bit double                      Type.double()                Type
80-bit float                       Type.x86_fp80()              Type
128-bit float (112-bit mantissa)   Type.fp128()                 Type
128-bit float (two 64-bits)        Type.ppc_fp128()             Type
function                           Type.function(r, p, v)       FunctionType
unpacked struct                    Type.struct(eltys)           StructType
packed struct                      Type.packed_struct(eltys)    StructType
array                              Type.array(elty, count)      ArrayType
pointer to value of type pty       Type.pointer(pty, addrspc)   PointerType
vector                             Type.vector(elty, count)     VectorType
void                               Type.void()                  Type
label                              Type.label()                 Type
opaque                             Type.opaque()                Type

The class hierarchy is:

Type
  IntegerType
  FunctionType
  StructType
  ArrayType
  PointerType
  VectorType

The class-level documentation follows:

llvm.core.Type
Static Constructors
int(n)

Create an integer type of bit width n.

float()

Create a 32-bit floating point type.

double()

Create a 64-bit floating point type.

x86_fp80()

Create an 80-bit 80x87-style floating point type.

fp128()

Create a 128-bit floating point type (112-bit mantissa).

ppc_fp128()

Create a 128-bit float (two 64-bits).

function(ret, params, vararg=False)

Create a function type, having the return type ret (must be a Type), accepting the parameters params, where params is an iterable, that yields Type objects representing the type of each function argument in order. If vararg is True, function is variadic.

struct(eltys)

Create an unpacked structure. eltys is an iterable, that yields Type objects representing the type of each element in order.

packed_struct(eltys)

Like struct(eltys), but creates a packed struct.

array(elty, count)

Creates an array type, holding count elements, each of type elty (which should be a Type).

pointer(pty, addrspc=0)

Create a pointer to type pty (which should be a Type). addrspc is an integer that represents the address space of the pointer (see LLVM docs or ask on llvm-dev for more info).

void()

Creates a void type. Used for function return types.

label()

Creates a label type.

opaque()

Opaque type, used for creating self-referencing types.

Properties
kind [read-only]

A value (enum) representing the "type" of the object. It will be one of the following constants defined in llvm.core:

# Warning: do not rely on actual numerical values!
TYPE_VOID       = 0
TYPE_FLOAT      = 1
TYPE_DOUBLE     = 2
TYPE_X86_FP80   = 3
TYPE_FP128      = 4
TYPE_PPC_FP128  = 5
TYPE_LABEL      = 6
TYPE_INTEGER    = 7
TYPE_FUNCTION   = 8
TYPE_STRUCT     = 9
TYPE_ARRAY      = 10
TYPE_POINTER    = 11
TYPE_OPAQUE     = 12
TYPE_VECTOR     = 13
TYPE_METADATA   = 14
TYPE_UNION      = 15

Example:

assert Type.int().kind == TYPE_INTEGER
assert Type.void().kind == TYPE_VOID

Methods
refine

Used for constructing self-referencing types. See the documentation of TypeHandle objects.

Special Methods
__str__

Type objects can be stringified into their LLVM assembly language representation.

__eq__

Type objects can be compared for equality. Internally, this converts both arguments into their LLVM assembly representations and compares the resultant strings.

llvm.core.IntegerType
Base Class
  • llvm.core.Type

Properties
width [read-only]

The width of the integer type, in number of bits.

llvm.core.FunctionType
Base Class
  • llvm.core.Type

Properties
return_type [read-only]

A Type object, representing the return type of the function.

vararg [read-only]

True if the function is variadic.

args [read-only]

Returns an iterable object that yields Type objects that represent, in order, the types of the arguments accepted by the function. Used like this:

func_type = Type.function( Type.int(), [ Type.int(), Type.int() ] )
for arg in func_type.args:
    assert arg.kind == TYPE_INTEGER
    assert arg == Type.int()
assert func_type.arg_count == len(func_type.args)

arg_count [read-only]

The number of arguments. Same as len(obj.args), but faster.

llvm.core.StructType
Base Class
  • llvm.core.Type

Properties
packed [read-only]

True if the structure is packed (no padding between elements).

elements [read-only]

Returns an iterable object that yields Type objects that represent, in order, the types of the elements of the structure. Used like this:

struct_type = Type.struct( [ Type.int(), Type.int() ] )
for elem in struct_type.elements:
    assert elem.kind == TYPE_INTEGER
    assert elem == Type.int()
assert struct_type.element_count == len(struct_type.elements)

element_count [read-only]

The number of elements. Same as len(obj.elements), but faster.

llvm.core.ArrayType
Base Class
  • llvm.core.Type

Properties
element [read-only]

A Type object representing the type of the element of the array.

count [read-only]

The number of elements in the array.

llvm.core.PointerType
Base Class
  • llvm.core.Type

Properties
address_space [read-only]

The address space of the pointer.

pointee [read-only]

A Type object representing the type of the value pointed to.

llvm.core.VectorType
Base Class
  • llvm.core.Type

Properties
element [read-only]

A Type object representing the type of the element of the vector.

count [read-only]

The number of elements in the vector.

Here is an example that demonstrates the creation of types:

#!/usr/bin/env python

from llvm.core import *

# integers
int_ty      = Type.int()
bool_ty     = Type.int(1)
int_64bit   = Type.int(64)

# floats
sprec_real  = Type.float()
dprec_real  = Type.double()

# arrays and vectors
intar_ty    = Type.array( int_ty, 10 )     # "typedef int intar_ty[10];"
twodim      = Type.array( intar_ty , 10 )  # "typedef int twodim[10][10];"
vec         = Type.vector( int_ty, 10 )    # a vector of 10 ints

# structures
s1_ty       = Type.struct( [ int_ty, sprec_real ] )
    # "struct s1_ty { int v1; float v2; };"

# pointers
intptr_ty   = Type.pointer(int_ty)         # "typedef int *intptr_ty;"

# functions
f1 = Type.function( int_ty, [ int_ty ] )
    # functions that take 1 int_ty and return 1 int_ty

f2 = Type.function( Type.void(), [ int_ty, int_ty ] )
    # functions that take 2 int_tys and return nothing

f3 = Type.function( Type.void(), ( int_ty, int_ty ) )
    # same as f2; any iterable can be used

fnargs = [ Type.pointer( Type.int(8) ) ]
printf = Type.function( Type.int(), fnargs, True )
    # variadic function

TypeHandle (llvm.core)

TypeHandle objects are used to create recursive types, like this linked list node structure in C:

struct node
{
    int data;
    struct node *next;
};

This can be realized in llvm-py like this:

#!/usr/bin/env python

from llvm.core import *

# create a type handle object
th = TypeHandle.new(Type.opaque())

# create the struct with an opaque* instead of self*
ts = Type.struct([ Type.int(), Type.pointer(th.type) ])

# unify the types
th.type.refine(ts)

# create a module, and add a "typedef"
m = Module.new('mod1')
m.add_type_name("struct.node", th.type)

# show what we created
print m

which gives the output:

; ModuleID = 'mod1'

%struct.node = type { i32, %struct.node* }

For more details on what is going on here, please refer to the LLVM Programmer’s Manual section "LLVM Type Resolution". The TypeHandle class of llvm-py corresponds to llvm::PATypeHolder in C++. The above example is available as test/typehandle.py in the source distribution.

llvm.core.TypeHandle
Static Constructors
new(abstract_ty)

Create a new TypeHandle instance, which holds a reference to the given abstract type abstract_ty. Typically, the abstract type used is Type.opaque().

Properties
type

Returns the contained type. Typically the refine method is called on the returned type.

Values (llvm.core)

llvm.core.Value is the base class of all values computed by a program that may be used as operands to other values. A value has a type associated with it (an object of llvm.core.Type).

The class hierarchy is:

Value
  User
    Constant
      ConstantExpr
      ConstantAggregateZero
      ConstantInt
      ConstantFP
      ConstantArray
      ConstantStruct
      ConstantVector
      ConstantPointerNull
      UndefValue
      GlobalValue
        GlobalVariable
        Function
    Instruction
      CallOrInvokeInstruction
      PHINode
      SwitchInstruction
      CompareInstruction
  Argument
  BasicBlock

The Value class is abstract; it is not meant to be instantiated. User is a Value that in turn uses (i.e., can refer to) other values (for example, a constant expression 1+2 refers to two constant values 1 and 2).

Constant-s represent constants that appear within code or as initializers of globals. They are constructed using static methods of Constant. Various types of constants are represented by various subclasses of Constant. However, most of them are empty and do not provide any additional attributes or methods over Constant.

The Function object represents an instance of a function type. Such objects contain Argument objects, which represent the actual, local-variable-like arguments of the function (not to be confused with the arguments returned by a function type object — these represent the type of the arguments).

The various Instruction-s are created by the Builder class. Most instructions are represented by Instruction itself, but there are a few subclasses that represent interesting instructions.

Value objects have a type (read-only), and a name (read-write).

llvm.core.Value
Properties
name

The name of the value.

type [read-only]

An llvm.core.Type object representing the type of the value.

uses [read-only]

The list of values (llvm.core.Value) that use this value.

use_count [read-only]

The number of values that use (refer to) this value. Same as len(val.uses) but faster if you just want the count.

value_id [read-only]

Returns llvm::Value::getValueID(). Refer to the LLVM documentation for more info.

Special Methods
__str__

Value objects can be stringified into their LLVM assembly language representation.

__eq__

Value objects can be compared for equality. Internally, this converts both arguments into their LLVM assembly representations and compares the resultant strings.

User (llvm.core)

User-s are values that refer to other values. The values so referred to can be retrieved through the properties of User. This is the reverse of Value.uses. Together these can be used to traverse the use-def chains of the SSA.

llvm.core.User
Base Class
  • llvm.core.Value

Properties
operands [read-only]

The list of operands (values, of type llvm.core.Value) that this value refers to.

operand_count [read-only]

The number of operands that this value refers to. Same as len(obj.operands) but faster if you just want the count.
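The relationship between operands and uses can be modelled with a small toy graph. The sketch below is plain Python and does not use llvm-py; ToyValue and depends_on are hypothetical names. It mirrors the constant expression 1+2 mentioned earlier: the expression lists 1 and 2 as operands, and each of them lists the expression among its uses.

```python
class ToyValue(object):
    """Toy value that tracks both directions of the use-def relation."""

    def __init__(self, name, operands=()):
        self.name = name
        self.operands = list(operands)  # values this value refers to
        self.uses = []                  # values that refer to this value
        for op in self.operands:
            op.uses.append(self)


def depends_on(value):
    """Follow operands transitively and return the names of everything reached."""
    seen, stack = [], list(value.operands)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.append(v)
            stack.extend(v.operands)
    return sorted(v.name for v in seen)


one = ToyValue("1")
two = ToyValue("2")
expr = ToyValue("1+2", [one, two])
neg = ToyValue("-(1+2)", [expr])

print([v.name for v in expr.operands])  # what expr refers to
print([v.name for v in one.uses])       # who refers to the constant 1
print(depends_on(neg))                  # everything neg depends on
```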

Constants (llvm.core)

Constant-s represent constants that appear within the code. The values of such objects are known at creation time. Constants can be created from Python constants. A constant expression is also a constant: given a Constant object, an operation (like addition, subtraction etc.) can be specified, to yield a new Constant object. Let’s see some examples:

#!/usr/bin/env python

from llvm.core import *

ti = Type.int()                         # a 32-bit int type

k1 = Constant.int(ti, 42)               # "int k1 = 42;"
k2 = k1.add( Constant.int( ti, 10 ) )   # "int k2 = k1 + 10;"

tr = Type.float()

r1 = Constant.real(tr, "3.141592")      # create from a string
r2 = Constant.real(tr, 1.61803399)      # create from a Python float

The following constructors (static methods) can be used to create constants:

Constructor Method What It Creates

null(ty)

A null value (all zeros) of type ty

all_ones(ty)

All 1’s value of type ty

undef(ty)

An "undefined" value of type ty

int(ty, value)

Integer of type ty, with value value (a Python int or long)

int_signextend(ty, value)

Integer of type ty, with value value (a Python int or long) treated as signed and sign-extended to the width of ty

real(ty, value)

Floating point value of type ty, with value value (a Python float, or a string as in the example above)

stringz(value)

A null-terminated string. value is a Python string

string(value)

As stringz(value), but not null-terminated

array(ty, consts)

Array of type ty, initialized with consts (an iterable yielding Constant objects of the appropriate type)

struct(ty, consts)

Struct (unpacked) of type ty, initialized with consts (an iterable yielding Constant objects of the appropriate type)

packed_struct(ty, consts)

As struct(ty, consts) but packed

vector(consts)

Vector, initialized with consts (an iterable yielding Constant objects of the appropriate type)

sizeof(ty)

Constant value representing the sizeof the type ty

The following operations on constants are supported. For more details on any operation, consult the Constant Expressions section of the LLVM Language Reference.

Method Operation

k.neg()

negation, same as 0 - k

k.not_()

1’s complement of k. Note trailing underscore.

k.add(k2)

k + k2, where k and k2 are integers.

k.fadd(k2)

k + k2, where k and k2 are floating-point.

k.sub(k2)

k - k2, where k and k2 are integers.

k.fsub(k2)

k - k2, where k and k2 are floating-point.

k.mul(k2)

k * k2, where k and k2 are integers.

k.fmul(k2)

k * k2, where k and k2 are floating-point.

k.udiv(k2)

Quotient of unsigned division of k by k2

k.sdiv(k2)

Quotient of signed division of k by k2

k.fdiv(k2)

Quotient of floating point division of k by k2

k.urem(k2)

Remainder of unsigned division of k by k2

k.srem(k2)

Remainder of signed division of k by k2

k.frem(k2)

Remainder of floating point division of k by k2

k.and_(k2)

Bitwise and of k and k2. Note trailing underscore.

k.or_(k2)

Bitwise or of k and k2. Note trailing underscore.

k.xor(k2)

Bitwise exclusive-or of k and k2.

k.icmp(icmp, k2)

Compare k with k2 using the predicate icmp. See table below for list of predicates for integer operands.

k.fcmp(fcmp, k2)

Compare k with k2 using the predicate fcmp. See table below for list of predicates for real operands.

k.shl(k2)

Shift k left by k2 bits.

k.lshr(k2)

Shift k logically right by k2 bits (new bits are 0s).

k.ashr(k2)

Shift k arithmetically right by k2 bits (new bits are same as previous sign bit).

k.gep(indices)

The getelementptr (GEP) operation on k with the given indices; see the LLVM docs.

k.trunc(ty)

Truncate k to a type ty of lower bitwidth.

k.sext(ty)

Sign extend k to a type ty of higher bitwidth, while extending the sign bit.

k.zext(ty)

Zero extend k to a type ty of higher bitwidth; all new bits are 0s.

k.fptrunc(ty)

Truncate floating point constant k to floating point type ty of lower size than k’s.

k.fpext(ty)

Extend floating point constant k to floating point type ty of higher size than k’s.

k.uitofp(ty)

Convert an unsigned integer constant k to floating point constant of type ty.

k.sitofp(ty)

Convert a signed integer constant k to floating point constant of type ty.

k.fptoui(ty)

Convert a floating point constant k to an unsigned integer constant of type ty.

k.fptosi(ty)

Convert a floating point constant k to a signed integer constant of type ty.

k.ptrtoint(ty)

Convert a pointer constant k to an integer constant of type ty.

k.inttoptr(ty)

Convert an integer constant k to a pointer constant of type ty.

k.bitcast(ty)

Convert k to a (equal-width) constant of type ty.

k.select(cond,k2,k3)

Replace value with k2 if the 1-bit integer constant cond is 1, else with k3.

k.extract_element(idx)

Extract value at idx (integer constant) from a vector constant k.

k.insert_element(k2,idx)

Insert value k2 (scalar constant) at index idx (integer constant) of vector constant k.

k.shuffle_vector(k2,mask)

Shuffle vector constant k based on vector constants k2 and mask.
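
The difference between the conversion operations (trunc, sext, zext) and between the two right shifts (lshr, ashr) in the table above is easiest to see on raw bit patterns. The following is a plain-Python sketch that models the arithmetic on unsigned integers. It does not call llvm-py; the function names merely echo the method names, and the explicit bit-width parameters are assumptions made for the illustration. Shift amounts are assumed to satisfy 0 <= amount < bits.

```python
def trunc(value, to_bits):
    """Keep only the low to_bits bits."""
    return value & ((1 << to_bits) - 1)


def zext(value, from_bits):
    """Zero extension: the new high bits are 0, so the pattern is unchanged."""
    return value & ((1 << from_bits) - 1)


def sext(value, from_bits, to_bits):
    """Sign extension: the new high bits are copies of the old sign bit."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):                       # sign bit is set
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


def lshr(value, amount, bits):
    """Logical shift right: vacated bits are filled with 0s."""
    return (value & ((1 << bits) - 1)) >> amount


def ashr(value, amount, bits):
    """Arithmetic shift right: vacated bits are filled with the sign bit."""
    value &= (1 << bits) - 1
    shifted = value >> amount
    if value >> (bits - 1):                            # sign bit is set
        shifted |= ((1 << amount) - 1) << (bits - amount)
    return shifted
```

For the 8-bit pattern 0xF0 (which is -16 when read as signed), sext to 16 bits gives 0xFFF0 while zext gives 0x00F0, and shifting right by 4 gives 0xFF arithmetically but 0x0F logically.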

Predicates for use with icmp instruction are listed below. All of these are integer constants defined in the llvm.core module.

Value Meaning

ICMP_EQ

Equality

ICMP_NE

Inequality

ICMP_UGT

Unsigned greater than

ICMP_UGE

Unsigned greater than or equal

ICMP_ULT

Unsigned less than

ICMP_ULE

Unsigned less than or equal

ICMP_SGT

Signed greater than

ICMP_SGE

Signed greater than or equal

ICMP_SLT

Signed less than

ICMP_SLE

Signed less than or equal
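
The signed and unsigned predicates differ only in how the same bit pattern is interpreted. As a rough plain-Python sketch (not llvm-py code; to_signed, icmp_ult and icmp_slt are hypothetical helper names, and the 32-bit default width is an assumption), the two interpretations can be modelled like this:

```python
def to_signed(value, bits):
    """Reinterpret an unsigned bit pattern as a two's complement number."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        return value - (1 << bits)
    return value


def icmp_ult(a, b, bits=32):
    """Models ICMP_ULT: compare the patterns as unsigned numbers."""
    mask = (1 << bits) - 1
    return (a & mask) < (b & mask)


def icmp_slt(a, b, bits=32):
    """Models ICMP_SLT: compare the patterns as signed numbers."""
    return to_signed(a, bits) < to_signed(b, bits)
```

The pattern 0xFFFFFFFF is the largest 32-bit unsigned value but is -1 when read as signed, so it is not "unsigned less than" 1, yet it is "signed less than" 1.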

Predicates for use with fcmp instruction are listed below. All of these are integer constants defined in the llvm.core module.

Value Meaning

FCMP_FALSE

Always false

FCMP_OEQ

True if ordered and equal

FCMP_OGT

True if ordered and greater than

FCMP_OGE

True if ordered and greater than or equal

FCMP_OLT

True if ordered and less than

FCMP_OLE

True if ordered and less than or equal

FCMP_ONE

True if ordered and operands are unequal

FCMP_ORD

True if ordered (no NaNs)

FCMP_UNO

True if unordered: isnan(X) | isnan(Y)

FCMP_UEQ

True if unordered or equal

FCMP_UGT

True if unordered or greater than

FCMP_UGE

True if unordered, or greater than or equal

FCMP_ULT

True if unordered or less than

FCMP_ULE

True if unordered, or less than or equal

FCMP_UNE

True if unordered or not equal

FCMP_TRUE

Always true
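
The ordered and unordered variants in the table above differ only in what they return when at least one operand is a NaN. A minimal plain-Python sketch of that rule follows. It uses Python floats rather than llvm-py constants, and the function names are hypothetical helpers that merely echo the predicate names.

```python
import math


def is_unordered(x, y):
    """Unordered means at least one operand is a NaN."""
    return math.isnan(x) or math.isnan(y)


def fcmp_ord(x, y):
    return not is_unordered(x, y)


def fcmp_uno(x, y):
    return is_unordered(x, y)


def fcmp_oeq(x, y):
    return (not is_unordered(x, y)) and x == y


def fcmp_ueq(x, y):
    return is_unordered(x, y) or x == y


def fcmp_one(x, y):
    return (not is_unordered(x, y)) and x != y


def fcmp_une(x, y):
    return is_unordered(x, y) or x != y


nan = float("nan")
```

Every ordered predicate is false as soon as a NaN is involved, and every unordered predicate is true in that case; for ordinary numbers each pair behaves identically.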

llvm.core.Constant
Base Class
  • llvm.core.Value

Static Constructors

See table of constructors above for full list.

Methods

See table of operations above for full list. There are no other methods.

Other Constant* Classes (llvm.core)

The following subclasses of Constant do not provide additional methods; they serve only to provide richer type information.

Subclass LLVM C++ Class Remarks

ConstantExpr

llvm::ConstantExpr

A constant expression

ConstantAggregateZero

llvm::ConstantAggregateZero

All-zero constant

ConstantInt

llvm::ConstantInt

An integer constant

ConstantFP

llvm::ConstantFP

A floating-point constant

ConstantArray

llvm::ConstantArray

An array constant

ConstantStruct

llvm::ConstantStruct

A structure constant

ConstantVector

llvm::ConstantVector

A vector constant

ConstantPointerNull

llvm::ConstantPointerNull

All-zero pointer constant

UndefValue

llvm::UndefValue

corresponds to undef of LLVM IR

These types are helpful in isinstance checks, like so:

ti = Type.int(32)
k1 = Constant.int(ti, 42)           # int32_t k1 = 42;
k2 = Constant.array(ti, [k1, k1])   # int32_t k2[] = { k1, k1 };

assert isinstance(k1, ConstantInt)
assert isinstance(k2, ConstantArray)

Global Value (llvm.core)

The class llvm.core.GlobalValue represents module-scope aliases, variables and functions. Global variables are represented by the sub-class llvm.core.GlobalVariable and functions by llvm.core.Function.

Global values have the read-write properties linkage, section, visibility and alignment. Use one of the following constants (from llvm.core) as values for linkage (see the LLVM documentation for details on each):

Value Equivalent LLVM Assembly Keyword

LINKAGE_EXTERNAL

externally_visible

LINKAGE_AVAILABLE_EXTERNALLY

available_externally

LINKAGE_LINKONCE_ANY

linkonce

LINKAGE_LINKONCE_ODR

linkonce_odr

LINKAGE_WEAK_ANY

weak

LINKAGE_WEAK_ODR

weak_odr

LINKAGE_APPENDING

appending

LINKAGE_INTERNAL

internal

LINKAGE_PRIVATE

private

LINKAGE_DLLIMPORT

dllimport

LINKAGE_DLLEXPORT

dllexport

LINKAGE_EXTERNAL_WEAK

extern_weak

LINKAGE_GHOST

deprecated — do not use

LINKAGE_COMMON

common

LINKAGE_LINKER_PRIVATE

linker_private

The section property can be assigned strings (like ".rodata"), which will be used if the target supports it. The visibility property can be set to one of these constants (from llvm.core, see also the LLVM docs):

Value Equivalent LLVM Assembly Keyword

VISIBILITY_DEFAULT

default

VISIBILITY_HIDDEN

hidden

VISIBILITY_PROTECTED

protected

The alignment property can be 0 (default), or can be set to a power of 2. The read-only property is_declaration can be used to check whether the global is a declaration. The module to which the global belongs can be retrieved using the read-only module property.
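
If alignment values are computed rather than hard-coded, it may be worth checking them before assigning. The helper below is a plain-Python sketch of the rule stated above (0, or a power of 2). The name is_valid_alignment is hypothetical and is not something llvm-py provides.

```python
def is_valid_alignment(n):
    """True for 0 (the default) or for any positive power of 2."""
    # A power of 2 has exactly one bit set, so n & (n - 1) clears it to 0.
    return n == 0 or (n > 0 and (n & (n - 1)) == 0)
```

For example, 0, 1, 4 and 16 pass the check, while 3, 12 and negative numbers do not.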

llvm.core.GlobalValue
Base Class
  • llvm.core.Constant

Properties
linkage

The linkage type, takes one of the constants listed above (LINKAGE_*).

section

A string like ".rodata", indicating the section into which the global is placed.

visibility

The visibility type, takes one of the constants listed above (VISIBILITY_*).

alignment

A power-of-2 integer indicating the boundary to align to.

is_declaration [read-only]

True if the global is a declaration, False otherwise.

module [read-only]

The module object to which this global belongs.

Global Variable (llvm.core)

Global variables (llvm.core.GlobalVariable) are subclasses of llvm.core.GlobalValue and represent module-level variables. These can have optional initializers and can be marked as constants. Global variables can be created either by using the add_global_variable method of the Module class (see above), or by using the static method GlobalVariable.new.

# create a global variable using add_global_variable method
gv1 = module_obj.add_global_variable(Type.int(), "gv1")

# or equivalently, using a static constructor method
gv2 = GlobalVariable.new(module_obj, Type.int(), "gv2")

Existing global variables of a module can be accessed by name using module_obj.get_global_variable_named(name) or GlobalVariable.get. All existing global variables can be enumerated via iterating over the property module_obj.global_variables.

# retrieve a reference to the global variable gv1,
# using the get_global_variable_named method
gv1 = module_obj.get_global_variable_named("gv1")

# or equivalently, using the static `get` method:
gv2 = GlobalVariable.get(module_obj, "gv2")

# list all global variables in a module
for gv in module_obj.global_variables:
    print gv.name, "of type", gv.type

The initializer for a global variable can be set by assigning to the initializer property of the object. The global_constant property can be used to indicate that the variable is a global constant.

Global variables can be deleted using the delete method. Do not use the object after calling delete on it.

# add an initializer 10 (32-bit integer)
gv.initializer = Constant.int( Type.int(), 10 )

# delete the global
gv.delete()
# DO NOT dereference `gv' beyond this point!
gv = None

llvm.core.GlobalVariable
Base Class
  • llvm.core.GlobalValue

Static Constructors
new(module_obj, ty, name)

Create a global variable named name of type ty in the module module_obj and return a GlobalVariable object that represents it.

get(module_obj, name)

Return a GlobalVariable object to represent the global variable named name in the module module_obj or raise LLVMException if such a variable does not exist.

Properties
initializer

The initializer of the variable. Set to an llvm.core.Constant (or derived) object. Reading it gives the initializer constant, or None if none exists.

global_constant

True if the variable is a global constant, False otherwise.

Methods
delete()

Deletes the global variable from its module. Do not hold any references to this object after calling delete on it.

Function (llvm.core)

Functions are represented by llvm.core.Function objects. They are contained within modules, and can be created either with the method module_obj.add_function or the static constructor Function.new. References to functions already present in a module can be retrieved via module_obj.get_function_named or by the static constructor method Function.get. All functions in a module can be enumerated by iterating over module_obj.functions.

# create a type, representing functions that take an integer and return
# a floating point value.
ft = Type.function( Type.float(), [ Type.int() ] )

# create a function of this type
f1 = module_obj.add_function(ft, "func1")

# or equivalently, like this:
f2 = Function.new(module_obj, ft, "func2")

# get a reference to an existing function
f3 = module_obj.get_function_named("func3")

# or like this:
f4 = Function.get(module_obj, "func4")

# list all function names in a module
for f in module_obj.functions:
    print f.name

References to intrinsic functions can be obtained via the static constructor intrinsic. This returns a Function object, calling which is equivalent to invoking the intrinsic. The intrinsic method has to be called with a module object, an intrinsic ID (which is a numeric constant) and a list of the types of the arguments (which LLVM uses to resolve overloaded intrinsic functions).

# get a reference to the llvm.bswap intrinsic
bswap = Function.intrinsic(mod, INTR_BSWAP, [Type.int()])

# call it
builder.call(bswap, [value])

Here, the constant INTR_BSWAP, available from llvm.core, represents the LLVM intrinsic llvm.bswap. The [Type.int()] selects the version of llvm.bswap that has a single 32-bit integer argument. The intrinsic IDs are defined as integer constants in llvm.core. These are:

INTR_ANNOTATION

INTR_ATOMIC_CMP_SWAP

INTR_ATOMIC_LOAD_ADD

INTR_ATOMIC_LOAD_AND

INTR_ATOMIC_LOAD_MAX

INTR_ATOMIC_LOAD_MIN

INTR_ATOMIC_LOAD_NAND

INTR_ATOMIC_LOAD_OR

INTR_ATOMIC_LOAD_SUB

INTR_ATOMIC_LOAD_UMAX

INTR_ATOMIC_LOAD_UMIN

INTR_ATOMIC_LOAD_XOR

INTR_ATOMIC_SWAP

INTR_BSWAP

INTR_CONVERTFF

INTR_CONVERTFSI

INTR_CONVERTFUI

INTR_CONVERTSIF

INTR_CONVERTSS

INTR_CONVERTSU

INTR_CONVERTUIF

INTR_CONVERTUS

INTR_CONVERTUU

INTR_COS

INTR_CTLZ

INTR_CTPOP

INTR_CTTZ

INTR_DBG_DECLARE

INTR_DBG_VALUE

INTR_EH_DWARF_CFA

INTR_EH_EXCEPTION

INTR_EH_RETURN_I32

INTR_EH_RETURN_I64

INTR_EH_SELECTOR

INTR_EH_SJLJ_CALLSITE

INTR_EH_SJLJ_LONGJMP

INTR_EH_SJLJ_LSDA

INTR_EH_SJLJ_SETJMP

INTR_EH_TYPEID_FOR

INTR_EH_UNWIND_INIT

INTR_EXP

INTR_EXP2

INTR_FLT_ROUNDS

INTR_FRAMEADDRESS

INTR_GCREAD

INTR_GCROOT

INTR_GCWRITE

INTR_INIT_TRAMPOLINE

INTR_INVARIANT_END

INTR_INVARIANT_START

INTR_LIFETIME_END

INTR_LIFETIME_START

INTR_LOG

INTR_LOG10

INTR_LOG2

INTR_LONGJMP

INTR_MEMCPY

INTR_MEMMOVE

INTR_MEMORY_BARRIER

INTR_MEMSET

INTR_OBJECTSIZE

INTR_PCMARKER

INTR_POW

INTR_POWI

INTR_PREFETCH

INTR_PTR_ANNOTATION

INTR_READCYCLECOUNTER

INTR_RETURNADDRESS

INTR_SADD_WITH_OVERFLOW

INTR_SETJMP

INTR_SIGLONGJMP

INTR_SIGSETJMP

INTR_SIN

INTR_SMUL_WITH_OVERFLOW

INTR_SQRT

INTR_SSUB_WITH_OVERFLOW

INTR_STACKPROTECTOR

INTR_STACKRESTORE

INTR_STACKSAVE

INTR_TRAP

INTR_UADD_WITH_OVERFLOW

INTR_UMUL_WITH_OVERFLOW

INTR_USUB_WITH_OVERFLOW

INTR_VACOPY

INTR_VAEND

INTR_VAR_ANNOTATION

INTR_VASTART

There are also target-specific intrinsics (which correspond to that target’s CPU instructions) available; they are omitted here for brevity. The full list can be seen in _intrinsic_ids.py. See the LLVM Language Reference for more information on the intrinsics, and the test directory in the source distribution for more examples. The intrinsic ID can be retrieved from a function object with the read-only property intrinsic_id.
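
To make the llvm.bswap example above more concrete, the effect of the 32-bit version of that intrinsic (reversing the byte order of its argument) can be modelled in plain Python. This is only an illustration of the result one would expect; bswap32 is a hypothetical helper written with the standard struct module and does not involve llvm-py.

```python
import struct


def bswap32(value):
    """Reverse the byte order of a 32-bit unsigned integer."""
    # Pack as big-endian, then read the same four bytes as little-endian.
    packed = struct.pack(">I", value & 0xFFFFFFFF)
    return struct.unpack("<I", packed)[0]
```

Swapping twice returns the original value, so bswap32(bswap32(x)) == x for any 32-bit x.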

The function’s calling convention can be set using the calling_convention property. The following (integer) constants defined in llvm.core can be used as values:

Value Equivalent LLVM Assembly Keyword
CC_C ccc
CC_FASTCALL fastcc
CC_COLDCALL coldcc
CC_X86_STDCALL x86_stdcallcc
CC_X86_FASTCALL x86_fastcallcc

See the LLVM docs for more information on each. Backend-specific numbered conventions can be directly passed as integers.

An arbitrary string identifying which garbage collector to use can be set or read with the property collector.

The value objects corresponding to the arguments of a function can be obtained using the read-only property args. These can be iterated over, and can also be indexed with integers. An example:

# list all argument names and types
for arg in fn.args:
    print arg.name, "of type", arg.type

# change the name of the first argument
fn.args[0].name = "objptr"

Basic blocks (see later) are contained within functions. When newly created, a function has no basic blocks. They have to be added explicitly, using the append_basic_block method, which adds a new, empty basic block as the last one in the function. The first basic block of the function can be retrieved using the get_entry_basic_block method. The existing basic blocks can be enumerated by iterating over the read-only property basic_blocks. The number of basic blocks can be obtained via the read-only property basic_block_count. Note that get_entry_basic_block is slightly faster than basic_blocks[0], and basic_block_count is slightly faster than len(f.basic_blocks).

# add a basic block
b1 = fn.append_basic_block("entry")

# get the first one
b2 = fn.get_entry_basic_block()
b2 = fn.basic_blocks[0]  # slower than previous method

# print names of all basic blocks
for b in fn.basic_blocks:
    print b.name

# get number of basic blocks
n = fn.basic_block_count
n = len(fn.basic_blocks)  # slower than previous method

Functions can be deleted using the method delete. This deletes them from their containing module. All references to the function object should be dropped after delete has been called.

Functions can be verified with the verify method. Note that this may not work properly (aborts on errors).

Function attributes, as documented in the LLVM documentation, can be set on functions using the methods add_attribute and remove_attribute. The following values may be used to refer to the LLVM attributes:

Value Equivalent LLVM Assembly Keyword
ATTR_ALWAYS_INLINE alwaysinline
ATTR_INLINE_HINT inlinehint
ATTR_NO_INLINE noinline
ATTR_OPTIMIZE_FOR_SIZE optsize
ATTR_NO_RETURN noreturn
ATTR_NO_UNWIND nounwind
ATTR_READ_NONE readnone
ATTR_READONLY readonly
ATTR_STACK_PROTECT ssp
ATTR_STACK_PROTECT_REQ sspreq
ATTR_NO_REDZONE noredzone
ATTR_NO_IMPLICIT_FLOAT noimplicitfloat
ATTR_NAKED naked

Here is how attributes can be set and removed:

# the names used below (Type, Module, ATTR_*) come from llvm.core
from llvm.core import *

# create a function
ti = Type.int(32)
tf = Type.function(ti, [ti, ti])
m = Module.new('mod')
f = m.add_function(tf, 'sum')
print f
#   declare i32 @sum(i32, i32)

# add a couple of attributes
f.add_attribute(ATTR_NO_UNWIND)
f.add_attribute(ATTR_READONLY)
print f
#   declare i32 @sum(i32, i32) nounwind readonly

# remove one of them again
f.remove_attribute(ATTR_READONLY)

llvm.core.Function
Base Class
  • llvm.core.GlobalValue

Static Constructors
new(module_obj, func_ty, name)

Create a function named name of type func_ty in the module module_obj and return a Function object that represents it.

get(module_obj, name)

Return a Function object to represent the function named name in the module module_obj or raise LLVMException if such a function does not exist.

get_or_insert(module_obj, func_ty, name)

Similar to get, except that if the function does not exist it is added first, as though with new.

intrinsic(module_obj, intrinsic_id, types)

Create and return a Function object that refers to an intrinsic function, as described above.

Properties
calling_convention

The calling convention for the function, as listed above.

collector

A string holding the name of the garbage collection algorithm. See LLVM docs.

does_not_throw

Setting to True sets the ATTR_NO_UNWIND attribute, False removes it. Shortcut to using f.add_attribute(ATTR_NO_UNWIND) and f.remove_attribute(ATTR_NO_UNWIND).

args [read-only]

List of llvm.core.Argument objects representing the formal arguments of the function.

basic_block_count [read-only]

Number of basic blocks belonging to this function. Same as len(f.basic_blocks) but faster if you just want the count.

entry_basic_block [read-only]

The llvm.core.BasicBlock object representing the entry basic block for this function, or None if there are no basic blocks.

basic_blocks [read-only]

List of llvm.core.BasicBlock objects representing the basic blocks belonging to this function.

intrinsic_id [read-only]

Returns the ID of the intrinsic if this object represents an intrinsic function, otherwise 0.

Methods
delete()

Deletes the function from its module. Do not hold any references to this object after calling delete on it.

append_basic_block(name)

Add a new basic block named name, and return a corresponding llvm.core.BasicBlock object. Note that if this is not the entry basic block, you’ll have to add appropriate branch instructions from other basic blocks yourself.

add_attribute(attr)

Add an attribute attr to the function, from the set listed above.

remove_attribute(attr)

Remove the attribute attr of the function.

viewCFG()

Displays the control flow graph using the GraphViz tool.

viewCFGOnly()

Displays the control flow graph using the GraphViz tool, but omitting function bodies.

verify()

Verifies the function. See LLVM docs.

Argument (llvm.core)

The args property of llvm.core.Function objects yields llvm.core.Argument objects. This allows for setting attributes on function arguments. Argument objects cannot be constructed from user code; the only way to get a reference to one is from a Function object.

The methods add_attribute and remove_attribute can be used to add or remove the following attributes:

Value Equivalent LLVM Assembly Keyword
ATTR_ZEXT zeroext
ATTR_SEXT signext
ATTR_IN_REG inreg
ATTR_BY_VAL byval
ATTR_STRUCT_RET sret
ATTR_NO_ALIAS noalias
ATTR_NO_CAPTURE nocapture
ATTR_NEST nest

These methods work exactly like the corresponding methods of the Function class above. Refer to the LLVM docs for information on what each attribute means.

The alignment of any argument can be set via the alignment property, to any power of 2.

llvm.core.Argument
Base Class
  • llvm.core.Value

Properties
alignment

The alignment of the argument. Must be a power of 2.

Methods
add_attribute(attr)

Add an attribute attr to the argument, from the set listed above.

remove_attribute(attr)

Remove the attribute attr of the argument.

Instructions (llvm.core)

An llvm.core.Instruction object represents an LLVM instruction. This class is the root of a small hierarchy:

Instruction
  CallOrInvokeInstruction
  PHINode
  SwitchInstruction
  CompareInstruction

Instructions are not created directly, but via a builder. The builder both creates instructions and adds them to a basic block at the same time. One way of getting instruction objects is from basic blocks.

Being derived from llvm.core.User, the instruction is-a user, i.e., an instruction in turn uses other values. The values an instruction uses are its operands. These may be accessed using the operands property from the llvm.core.User base.

The name of the instruction (like add, mul etc.) can be obtained via the opcode_name property. The basic_block property gives the basic block to which the instruction belongs. Note that llvm-py does not allow free-standing instruction objects (i.e., all instructions are created contained within a basic block).

The class of an instruction can be queried via the properties is_terminator, is_binary_op, is_shift etc. See below for the full list.

llvm.core.Instruction
Base Class
  • llvm.core.User

Properties
basic_block [read-only]

The basic block to which this instruction belongs.

is_terminator [read-only]

True if the instruction is a terminator instruction.

is_binary_op [read-only]

True if the instruction is a binary operator.

is_shift [read-only]

True if the instruction is a shift instruction.

is_cast [read-only]

True if the instruction is a cast instruction.

is_logical_shift [read-only]

True if the instruction is a logical shift instruction.

is_arithmetic_shift [read-only]

True if the instruction is an arithmetic shift instruction.

is_associative [read-only]

True if the instruction is associative.

is_commutative [read-only]

True if the instruction is commutative.

is_volatile [read-only]

True if the instruction is a volatile load or store.

opcode [read-only]

The numeric opcode value of the instruction. Do not rely on the absolute value of this number; it may change between LLVM versions.

opcode_name [read-only]

The name of the instruction, like add, sub etc.

CallOrInvokeInstruction (llvm.core)

The llvm.core.CallOrInvokeInstruction is a subclass of llvm.core.Instruction, and represents either a call or an invoke instruction.

llvm.core.CallOrInvokeInstruction
Base Class
  • llvm.core.Instruction

Properties
calling_convention

Get or set the calling convention. See the list above for possible values.

Methods
add_parameter_attribute(idx, attr)

Add an attribute attr to the idx-th argument. See above for possible values of attr.

remove_parameter_attribute(idx, attr)

Remove an attribute attr from the idx-th argument. See above for possible values of attr.

set_parameter_alignment(idx, align)

Set the alignment of the idx-th argument to align. align should be a power of two.

PHINode (llvm.core)

The llvm.core.PHINode is a subclass of llvm.core.Instruction, and represents the phi instruction. When created (using Builder.phi) the phi node contains no incoming blocks (nor their corresponding values). To add an incoming arc to the phi node, use the add_incoming method, which takes a source block (llvm.core.BasicBlock object) and a value (object of llvm.core.Value or of a class derived from it) that the phi node will take on if control branches in from that block.

llvm.core.PHINode
Base Class
  • llvm.core.Instruction

Properties
incoming_count [read-only]

The number of incoming arcs for this phi node.

Methods
add_incoming(value, block)

Add an incoming arc, from the llvm.core.BasicBlock object block, with the corresponding value value. value should be an object of llvm.core.Value (or of a descendant class).

get_incoming_value(idx)

Returns the idx-th incoming arc’s value.

get_incoming_block(idx)

Returns the idx-th incoming arc’s block.
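
The behaviour of a phi node can be sketched with a small stand-alone model. ToyPhi below is a hypothetical plain-Python class, not the llvm-py PHINode. Its methods borrow the names documented above so the correspondence is easy to follow, and the extra resolve method is an assumption added only to show which value the phi would take for a given predecessor block.

```python
# Toy model only: ToyPhi is hypothetical and is NOT llvm.core.PHINode.
class ToyPhi(object):
    def __init__(self):
        self.incoming = []                  # (value, block) pairs

    def add_incoming(self, value, block):
        self.incoming.append((value, block))

    @property
    def incoming_count(self):
        return len(self.incoming)

    def get_incoming_value(self, idx):
        return self.incoming[idx][0]

    def get_incoming_block(self, idx):
        return self.incoming[idx][1]

    def resolve(self, predecessor):
        """Value the phi takes if control arrives from predecessor."""
        for value, block in self.incoming:
            if block == predecessor:
                return value
        raise KeyError("no incoming arc from %r" % (predecessor,))


phi = ToyPhi()                    # starts with no incoming arcs
phi.add_incoming(1, "then")       # value 1 if we came from block "then"
phi.add_incoming(2, "else")       # value 2 if we came from block "else"
```

Blocks are represented by plain strings here purely for brevity; in llvm-py they would be llvm.core.BasicBlock objects.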

SwitchInstruction (llvm.core)

(TODO describe)

llvm.core.SwitchInstruction
Base Class
  • llvm.core.Instruction

Methods
add_case(const, block)

Add another case to the switch instruction. When the expression being evaluated equals const, control branches to block. Here const must be of type llvm.core.ConstantInt.
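
How the cases of a switch are consulted can be pictured with a toy model. ToySwitch below is a hypothetical plain-Python class, not the llvm-py SwitchInstruction. The default destination reflects how the LLVM switch instruction behaves in general and is not something this guide documents; the target method is an assumption added for the illustration.

```python
# Toy model only: ToySwitch is hypothetical and is NOT llvm-py code.
class ToySwitch(object):
    def __init__(self, default_block):
        self.default_block = default_block
        self.cases = []                     # (constant, block) pairs

    def add_case(self, const, block):
        self.cases.append((const, block))

    def target(self, value):
        """Block that control branches to for the given value."""
        for const, block in self.cases:
            if value == const:
                return block
        return self.default_block           # no case matched


sw = ToySwitch("bb_default")
sw.add_case(0, "bb_zero")
sw.add_case(1, "bb_one")
```

Constants and blocks are plain Python integers and strings here; in llvm-py they would be ConstantInt and BasicBlock objects.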

CompareInstruction (llvm.core)

(TODO describe)

llvm.core.CompareInstruction
Base Class
  • llvm.core.Instruction

Properties
predicate [read-only]

The predicate of the compare instruction, one of the ICMP_* or FCMP_* constants.

Basic Block (llvm.core)

TODO

Builder (llvm.core)

TODO

Target Data (llvm.ee)

TODO

Execution Engine (llvm.ee)

TODO. For now, see test/example-jit.py.

Pass Manager and Passes (llvm.passes)

TODO. For now, see test/passes.py.

About the llvm-py Project

llvm-py lives at http://www.mdevan.org/llvm-py/. The code (subversion repository) and the issue tracker are hosted on the Google code hosting service, at http://code.google.com/p/llvm-py/. It is distributed under the new BSD license; the full license text is in the file named LICENSE in the source distribution.

There is an llvm-py mailing list / users group: http://groups.google.com/group/llvm-py.

The entire llvm-py website is generated from marked up text files using the tool AsciiDoc. These text files and the (pre-)generated HTML pages are available in the source distribution.

llvm-py is an ongoing, live project. Your contributions in any form are most welcome. You can check out the latest SVN HEAD from the Google Code repository mentioned above.

Mahadevan R wrote llvm-py and works on it in his spare time. He can be reached at mdevan@mdevan.org.