File: teflon.rst

package info (click to toggle)
mesa 25.2.7-1
  • links: PTS, VCS
  • area: main
  • in suites: sid
  • size: 311,960 kB
  • sloc: ansic: 2,185,172; xml: 1,028,239; cpp: 512,159; python: 76,146; asm: 38,329; yacc: 12,198; lisp: 4,114; lex: 3,429; sh: 855; makefile: 237
file content (165 lines) | stat: -rw-r--r-- 6,545 bytes parent folder | download | duplicates (7)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
TensorFlow Lite delegate
========================

Mesa contains a TensorFlow Lite delegate that can make use of NPUs to accelerate ML inference. It is implemented in the form of a *external delegate*, a shared library that the TensorFlow Lite runtime can load at startup. See https://www.tensorflow.org/api_docs/python/tf/lite/experimental/load_delegate.

.. list-table:: Supported acceleration hardware
   :header-rows: 1

   * - Gallium driver
     - NPU supported
     - Hardware tested
   * - Etnaviv
     - ``VeriSilicon VIPNano-QI.7120``
     - ``Amlogic A311D on Libre Computer AML-A311D-CC Alta and Khadas VIM3``
   * - Etnaviv
     - ``VeriSilicon VIPNano-SI+.8002``
     - ``NXP iMX8M Plus on Toradex Verdin SoM``

.. list-table:: Tested models
   :header-rows: 1

   * - Model name
     - Data type
     - Link (may be outdated)
     - Status
     - Inference speed on AML-A311D-CC Alta
     - Inference speed on Verdin iMX8M Plus
   * - MobileNet V1
     - UINT8
     - http://download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224_quant.tgz
     - Fully supported
     - ~6.6 ms
     - ~7.9 ms
   * - MobileNet V2
     - UINT8
     - https://storage.googleapis.com/mobilenet_v2/checkpoints/quantized_v2_224_100.tgz
     - Fully supported
     - ~6.9 ms
     - ~8.0 ms
   * - SSDLite MobileDet
     - UINT8
     - https://raw.githubusercontent.com/google-coral/test_data/master/ssdlite_mobiledet_coco_qat_postprocess.tflite
     - Fully supported
     - ~24.8 ms
     - ~24.4 ms

Build
-----

Build Mesa as usual, with the -Dteflon=true argument.

Example instructions:

.. code-block:: console

   # Install build dependencies
   ~ # apt-get -y build-dep mesa
   ~ # apt-get -y install git cmake

   # Download sources
   ~ $ git clone https://gitlab.freedesktop.org/mesa/mesa.git

   # Build Mesa
   ~ $ cd mesa
   mesa $ meson setup build -Dgallium-drivers=etnaviv -Dvulkan-drivers= -Dteflon=true
   mesa $ meson compile -C build

Install runtime dependencies
----------------------------

Your board should have booted into a mainline 6.7 (6.8 for the i.MX8MP) or greater kernel.

.. code-block:: console

   # Install Python 3.10 and dependencies (as root)
   ~ # echo deb-src http://deb.debian.org/debian testing main >> /etc/apt/sources.list
   ~ # echo deb http://deb.debian.org/debian unstable main >> /etc/apt/sources.list
   ~ # echo 'APT::Default-Release "testing";' >> /etc/apt/apt.conf
   ~ # apt-get update
   ~ # apt-get -y install python3.10 python3-pytest python3-exceptiongroup

   # Install TensorFlow Lite Python package (as non-root)
   ~ $ python3.10 -m pip install --break-system-packages tflite-runtime==2.13.0

   # For the classification.py script mentioned below, you will need PIL
   ~ $ python3.10 -m pip install --break-system-packages pillow

Do some inference with MobileNetV1
----------------------------------

Run the above for a quick way of checking that the setup is correct and the NPU is accelerating the inference. It assumes you have followed the steps above so Python 3.10 and dependencies have been installed, and assumes that Mesa was built to the ``./build`` directory.

You can use any image that prominently features one of the objects in the ``src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt`` file.

This example script has been based from the code in https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/examples/python.

.. code-block:: console

   ~ $ cd mesa/
   mesa $ TEFLON_DEBUG=verbose ETNA_MESA_DEBUG=ml_dbgs python3.10 src/gallium/frontends/teflon/tests/classification.py \
          -i ~/tensorflow/assets/grace_hopper.bmp \
          -m src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite \
          -l src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt \
          -e build/src/gallium/targets/teflon/libteflon.so

   Loading external delegate from build/src/gallium/targets/teflon/libteflon.so with args: {}
   Teflon delegate: loaded etnaviv driver

   teflon: compiling graph: 89 tensors 28 operations
   idx scale     zp has_data size        
   =======================================
   0 0.023528   0 no       1x1x1x1024
   1 0.166099  42 no       1x1x1x1001
   2 0.000117   0 yes      1001x0x0x0
   3 0.004987  4a yes      1001x1x1x1024
   4 0.166099  42 no       1x1001x0x0
   5 0.166099  42 yes      2x0x0x0
   6 0.000171   0 yes      32x0x0x0
   7 0.023528   0 no       1x112x112x32
   8 0.021827  97 yes      32x3x3x3
   9 0.023528   0 no       1x14x14x512
   ...

   idx type    in out  operation type-specific
   ================================================================================================
   0 CONV    88   7  w: 8 b: 6 stride: 2 pad: SAME
   1 DWCONV   7  33  w: 35 b: 34 stride: 1 pad: SAME
   2 CONV    33  37  w: 38 b: 36 stride: 1 pad: SAME
   3 DWCONV  37  39  w: 41 b: 40 stride: 2 pad: SAME
   4 CONV    39  43  w: 44 b: 42 stride: 1 pad: SAME
   5 DWCONV  43  45  w: 47 b: 46 stride: 1 pad: SAME
   6 CONV    45  49  w: 50 b: 48 stride: 1 pad: SAME
   7 DWCONV  49  51  w: 53 b: 52 stride: 2 pad: SAME
   8 CONV    51  55  w: 56 b: 54 stride: 1 pad: SAME
   9 DWCONV  55  57  w: 59 b: 58 stride: 1 pad: SAME
   10 CONV    57  61  w: 62 b: 60 stride: 1 pad: SAME
   11 DWCONV  61  63  w: 65 b: 64 stride: 2 pad: SAME
   12 CONV    63  67  w: 68 b: 66 stride: 1 pad: SAME
   13 DWCONV  67  69  w: 71 b: 70 stride: 1 pad: SAME
   14 CONV    69  73  w: 74 b: 72 stride: 1 pad: SAME
   15 DWCONV  73  75  w: 77 b: 76 stride: 1 pad: SAME
   16 CONV    75  79  w: 80 b: 78 stride: 1 pad: SAME
   17 DWCONV  79  81  w: 83 b: 82 stride: 1 pad: SAME
   18 CONV    81  85  w: 86 b: 84 stride: 1 pad: SAME
   19 DWCONV  85   9  w: 11 b: 10 stride: 1 pad: SAME
   20 CONV     9  13  w: 14 b: 12 stride: 1 pad: SAME
   21 DWCONV  13  15  w: 17 b: 16 stride: 1 pad: SAME
   22 CONV    15  19  w: 20 b: 18 stride: 1 pad: SAME
   23 DWCONV  19  21  w: 23 b: 22 stride: 2 pad: SAME
   24 CONV    21  25  w: 26 b: 24 stride: 1 pad: SAME
   25 DWCONV  25  27  w: 29 b: 28 stride: 1 pad: SAME
   26 CONV    27  31  w: 32 b: 30 stride: 1 pad: SAME
   27 POOL    31   0  filter: 0x0 stride: 0 pad: VALID

   teflon: compiled graph, took 10307 ms
   teflon: invoked graph, took 21 ms
   teflon: invoked graph, took 17 ms
   teflon: invoked graph, took 17 ms
   teflon: invoked graph, took 17 ms
   teflon: invoked graph, took 16 ms
   0.866667: military uniform
   0.031373: Windsor tie
   0.015686: mortarboard
   0.007843: bow tie
   0.007843: academic