#!/usr/bin/python3
# Halide tutorial lesson 11.
# This lesson demonstrates how to use Halide as a cross-compiler.
# This lesson can be built by invoking the command:
# make test_tutorial_lesson_11_cross_compilation
# in a shell with the current directory at python_bindings/
import halide as hl
from struct import unpack
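

# A minimal sketch, not part of the original lesson: the header checks in
# main() compare the first bytes of each object file against per-format magic
# values. The hypothetical helper below (its name and its return labels are
# assumptions made for illustration) applies the same three checks to an
# in-memory header. It reads multi-byte fields as little-endian explicitly, so
# the result does not depend on the byte order of the machine running it.
import struct


def classify_object_header(header):
    """Return a short label for the object-file format of `header` (bytes)."""
    # ELF: 0x7F 'E' 'L' 'F', followed by the class byte (1 = 32-bit, 2 = 64-bit).
    if header[:4] == b"\x7fELF" and len(header) >= 5:
        return {1: "elf32", 2: "elf64"}.get(header[4], "elf-unknown")
    # 32-bit little-endian Mach-O: the first uint32 is 0xFEEDFACE.
    if len(header) >= 4 and struct.unpack("<I", header[:4])[0] == 0xFEEDFACE:
        return "macho32"
    # COFF for x86-64: the first uint16 (the machine field) is 0x8664.
    if len(header) >= 2 and struct.unpack("<H", header[:2])[0] == 0x8664:
        return "coff-x86-64"
    return "unknown"


# Possible usage: classify_object_header(open(path, "rb").read(16)) for any of
# the object files written by main().

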
def main():
    # We'll define the simple one-stage pipeline that we used in lesson 10.
    brighter = hl.Func("brighter")
    x, y = hl.Var("x"), hl.Var("y")

    # Declare the arguments.
    offset = hl.Param(hl.UInt(8))
    input = hl.ImageParam(hl.UInt(8), 2)
    args = [input, offset]

    # Define the hl.Func.
    brighter[x, y] = input[x, y] + offset

    # Schedule it.
    brighter.vectorize(x, 16).parallel(y)

    # The following line is what we did in lesson 10. It compiles an
    # object file suitable for the system that you're running this
    # program on. For example, if you compile and run this file on
    # 64-bit linux on an x86 cpu with sse4.1, then the generated code
    # will be suitable for 64-bit linux on x86 with sse4.1.
    brighter.compile_to_file("lesson_11_host", args, "lesson_11_host")

    # We can also compile object files suitable for other cpus and
    # operating systems. You do this with an optional final argument
    # to compile_to_file (after the function name), which specifies
    # the target to compile for.
    create_android = True
    create_windows = True
    create_ios = True

    if create_android:
        # Let's use this to compile a 32-bit arm android version of this code:
        target = hl.Target()
        target.os = hl.TargetOS.Android  # The operating system
        target.arch = hl.TargetArch.ARM  # The CPU architecture
        target.bits = 32  # The bit-width of the architecture
        arm_features = []  # A list of features to set
        target.set_features(arm_features)
        # Pass the target as the last argument.
        brighter.compile_to_file(
            "lesson_11_arm_32_android", args, "lesson_11_arm_32_android", target
        )

    if create_windows:
        # And now a Windows object file for 64-bit x86 with AVX and SSE 4.1:
        target = hl.Target()
        target.os = hl.TargetOS.Windows
        target.arch = hl.TargetArch.X86
        target.bits = 64
        target.set_features([hl.TargetFeature.AVX, hl.TargetFeature.SSE41])
        brighter.compile_to_file(
            "lesson_11_x86_64_windows", args, "lesson_11_x86_64_windows", target
        )

    if create_ios:
        # And finally an iOS mach-o object file for one of Apple's 32-bit
        # ARM processors - the A6. It's used in the iPhone 5. The A6 uses
        # a slightly modified ARM architecture called ARMv7s. We specify
        # this using the target features field. Support for Apple's
        # 64-bit ARM processors is very new in LLVM, and still somewhat
        # flaky.
        target = hl.Target()
        target.os = hl.TargetOS.IOS
        target.arch = hl.TargetArch.ARM
        target.bits = 32
        target.set_features([hl.TargetFeature.ARMv7s])
        brighter.compile_to_file(
            "lesson_11_arm_32_ios", args, "lesson_11_arm_32_ios", target
        )

    # Now let's check these files are what they claim, by examining
    # their first few bytes.
    if create_android:
        # 32-bit arm android object files start with the following magic
        # bytes (compared as unsigned 8-bit values):
        arm_32_android_magic = [
            0x7F,
            ord("E"),
            ord("L"),
            ord("F"),  # ELF format
            1,  # 32-bit
            1,  # 2's complement little-endian
            1,  # Current version of elf
        ]
        length = len(arm_32_android_magic)
        with open("lesson_11_arm_32_android.o", "rb") as f:
            header_bytes = f.read(length)

        header = list(unpack("B" * length, header_bytes))
        assert header == arm_32_android_magic, (
            "Unexpected header bytes in 32-bit arm object file: "
            + str([x == y for x, y in zip(header, arm_32_android_magic)])
        )

    if create_windows:
        # 64-bit windows object files start with the magic 16-bit value 0x8664
        # (presumably referring to x86-64). It is stored little-endian, so
        # the file begins with the bytes 0x64, 0x86.
        win_64_magic = [0x64, 0x86]
        with open("lesson_11_x86_64_windows.obj", "rb") as f:
            header_bytes = f.read(2)

        header = list(unpack("B" * 2, header_bytes))
        assert header == win_64_magic, (
            "Unexpected header bytes in 64-bit windows object file."
        )

    if create_ios:
        # 32-bit arm iOS mach-o files start with the following magic values
        # (four unsigned 32-bit integers, stored little-endian):
        arm_32_ios_magic = [
            0xFEEDFACE,  # Mach-o magic bytes
            12,  # CPU type is ARM
            11,  # CPU subtype is ARMv7s
            1,  # It's a relocatable object file.
        ]
        with open("lesson_11_arm_32_ios.o", "rb") as f:
            header_bytes = f.read(4 * 4)

        # Read the fields as little-endian explicitly ("<") so the check
        # does not depend on the byte order of the host machine.
        header = list(unpack("<" + "I" * 4, header_bytes))
        assert header == arm_32_ios_magic, (
            "Unexpected header bytes in 32-bit arm ios object file."
        )

    # It looks like the object files we produced are plausible for
    # those targets. We'll count that as a success for the purposes
    # of this tutorial. For a real application you'd then need to
    # figure out how to integrate Halide into your cross-compilation
    # toolchain. There are several small examples of this in the
    # Halide repository under the apps folder. See HelloAndroid and
    # HelloiOS here:
    # https://github.com/halide/Halide/tree/master/apps/
    print("Success!")
    return 0
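

# A minimal sketch, not part of the original lesson: the assertion messages in
# main() either list one boolean per byte or do not say where the header
# differs. The hypothetical helper below (its name and return shape are
# assumptions made for illustration) reports the first differing position.
def first_mismatch(actual, expected):
    """Return (index, actual_value, expected_value) for the first difference
    between two sequences, or None if they are equal."""
    for index, (a, e) in enumerate(zip(actual, expected)):
        if a != e:
            return index, a, e
    if len(actual) != len(expected):
        # One sequence is a prefix of the other; report the first missing slot.
        index = min(len(actual), len(expected))
        a = actual[index] if index < len(actual) else None
        e = expected[index] if index < len(expected) else None
        return index, a, e
    return None


# Possible usage: include first_mismatch(header, arm_32_android_magic) in an
# assertion message to point at the offending byte.

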
if __name__ == "__main__":
    main()