# VaAPI
This page documents tracing and debugging the Video Acceleration API (VaAPI or
VA-API) on ChromeOS. The VA-API is an open-source library and API specification,
providing access to graphics hardware acceleration capabilities for video and
image processing. The VaAPI is used on ChromeOS on both Intel and AMD platforms.
[TOC]
## Overview
VaAPI code is developed upstream on the [VaAPI GitHub repository], of which
ChromeOS is a downstream client via the [libva] package, with packaged backends
for both [Intel] and [AMD].
[VaAPI GitHub repository]: https://github.com/intel/libva
[libva]: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/master/x11-libs/libva/
[Intel]: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/master/x11-libs/libva-intel-driver/
[AMD]: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/master/media-libs/libva-amdgpu-driver/
## Tracing VaAPI video decoding
A simplified diagram of the buffer circulation is provided below. The "client"
is always a Renderer process, communicating over Mojo/IPC. Essentially, the
VaAPI Video Decode Accelerator ([VaVDA]) receives encoded BitstreamBuffers from
the client and sends them to the "va internals", which eventually produce
decoded video in PictureBuffers. The VaVDA may or may not use the `Vpp` unit
for pixel format adaptation, depending on the codec used, the silicon
generation, and other specifics.
```
  K  BitstreamBuffers   +-----+        +-------------------+
C --------------------->| Va  |------->|                   |
L <---------------------| VDA |<-------|   va internals    |
I    (encoded stuff)    |     |        |                   |
E                       |     |        |     +-----+       |  +----+
N <---------------------|     |<-------------|     |<---------| lib|
T --------------------->|     |------------->| Vpp |--------->| va |
                        +-----+        +-----+-----+-------+  +----+
  N   PictureBuffers                              M
      (decoded stuff)                         VASurfaces
```
*** aside
PictureBuffers are created by the "client" but allocated and filled in by the
VaVDA. `K` is unrelated to both `M` and `N`.
***
[VaVDA]: https://cs.chromium.org/chromium/src/media/gpu/vaapi/vaapi_video_decode_accelerator.h?type=cs&q=vaapivideodecodeaccelerator&sq=package:chromium&g=0&l=57
### Tracing memory consumption
Tracing memory consumption is done via the [MemoryInfra] system. Please take a
minute and read that document (in particular the [difference between
`effective_size` and `size`]). The VaAPI lives inside the GPU process (a.k.a.
Viz process), so please familiarize yourself with the [GPU Memory Tracing]
document. The VaVDA provides information by implementing the [Memory Dump
Provider] interface, but the information reported varies with the execution
mode, as explained next.
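For orientation, a Memory Dump Provider reports its allocations whenever the
tracing service requests a dump. The following is a simplified sketch of that
interface as the VaVDA might implement it, not the actual Chromium code;
`total_va_surfaces_bytes_` is a hypothetical member standing in for the real
bookkeeping:

```c++
// Simplified sketch of a base::trace_event::MemoryDumpProvider implementation;
// the real VaVDA code lives in media/gpu/vaapi/.
bool VaapiVideoDecodeAccelerator::OnMemoryDump(
    const base::trace_event::MemoryDumpArgs& args,
    base::trace_event::ProcessMemoryDump* pmd) {
  // Report all internally allocated VASurfaces under a single dump name.
  base::trace_event::MemoryAllocatorDump* dump =
      pmd->CreateAllocatorDump("gpu/vaapi/decoder");
  dump->AddScalar(base::trace_event::MemoryAllocatorDump::kNameSize,
                  base::trace_event::MemoryAllocatorDump::kUnitsBytes,
                  total_va_surfaces_bytes_);  // Hypothetical bookkeeping.
  return true;
}
```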
#### Internal VASurfaces accounting
The usage of the `Vpp` unit is controlled by the member variable
[`|decode_using_client_picture_buffers_|`]; when this is true the `Vpp` is
bypassed, which is very advantageous in terms of CPU, power, and memory
consumption (see [crbug.com/822346]).
* When [`|decode_using_client_picture_buffers_|`] is false, `libva` uses a set
of internally allocated VASurfaces that are accounted for in the
`gpu/vaapi/decoder` tracing category (see screenshot below). Each of these
VASurfaces is backed by a Buffer Object large enough to hold, at least, the
decoded image in YUV semiplanar format. In the diagram above, `M` varies: 4
for VP8, 9 for VP9, 4-12 for H264/AVC1 (see [`GetNumReferenceFrames()`]).

* When [`|decode_using_client_picture_buffers_|`] is true, `libva` can decode
directly on the client's PictureBuffers, `M = 0`, and the `gpu/vaapi/decoder`
category is not present in the GPU MemoryInfra.
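As a back-of-the-envelope check (assuming an NV12 layout, i.e. 1.5 bytes per
pixel, a typical YUV semiplanar format): a 1920x1080 VASurface occupies at
least 1920 × 1080 × 1.5 ≈ 3.1 MB, so a VP9 stream with `M` = 9 accounts for
roughly 28 MB under `gpu/vaapi/decoder`.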
[MemoryInfra]: https://chromium.googlesource.com/chromium/src/+/HEAD/docs/memory-infra/README.md#memoryinfra
[difference between `effective_size` and `size`]: https://chromium.googlesource.com/chromium/src/+/HEAD/docs/memory-infra#effective_size-vs_size
[GPU Memory Tracing]: ../memory-infra/probe-gpu.md
[Memory Dump Provider]: https://chromium.googlesource.com/chromium/src/+/HEAD/docs/memory-infra/adding_memory_infra_tracing.md
[`|decode_using_client_picture_buffers_|`]: https://cs.chromium.org/search/?q=decode_using_client_picture_buffers_&sq=package:chromium&type=cs
[crbug.com/822346]: https://crbug.com/822346
[`GetNumReferenceFrames()`]: https://cs.chromium.org/search/?q=GetNumReferenceFrames+file:%5Esrc/media/gpu/+package:%5Echromium$+file:%5C.cc&type=cs
#### PictureBuffers accounting
VaVDA allocates storage for the N PictureBuffers provided by the client by means
of VaapiPicture{NativePixmapOzone}s, backed by NativePixmaps, themselves backed
by DmaBufs (the client only knows about the client Texture IDs). The GPU's
TextureManager accounts for these textures, but:
- They are not correctly identified as being backed by NativePixmaps (see
[crbug.com/514914]).
- They are not correctly linked back to the Renderer or ARC++ client on whose
  behalf the allocation took place, as is done in e.g. [the probe-gpu example]
  (see [crbug.com/721674]).
See e.g. the following ToT example for 10 1920x1080p textures (32bpp); finding
the desired `context_group` can be tricky.
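(For reference, at 32bpp each such texture occupies 1920 × 1080 × 4 ≈ 8.3 MB,
so the ten of them amount to roughly 83 MB.)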

[crbug.com/514914]: https://crbug.com/514914
[the probe-gpu example]: https://chromium.googlesource.com/chromium/src/+/HEAD/docs/memory-infra/probe-gpu.md#example
[crbug.com/721674]: https://crbug.com/721674
### Tracing power consumption
Power consumption is available on ChromeOS test/dev images via the command line
binary [`dump_intel_rapl_consumption`]; this tool averages the power
consumption of the four SoC domains over a configurable period of time, usually
a few seconds. These domains are, in the order presented by the tool:
* `pkg`: estimated power consumption of the whole SoC; in particular, this is a
superset of pp0 and pp1, including all accessory silicon, e.g. video
processing.
* `pp0`: CPU set.
* `pp1`/`gfx`: Integrated GPU or GPUs.
* `dram`: estimated power consumption of the DRAM, from the bus activity.
Googlers can read more about this topic under
[go/power-consumption-meas-in-intel].
`dump_intel_rapl_consumption` is usually run while a given workload is active
(e.g. a video playback), with an interval larger than a second to smooth out
the contribution of periodic system services, e.g. WiFi, that would otherwise
show up at smaller sampling periods.
```shell
dump_intel_rapl_consumption --interval_ms=2000 --repeat --verbose
```
E.g. on a nocturne main1, the average power consumption in watts while playing
back the first minute of a 1080p VP9 [video] is:
|`pkg` |`pp0` |`pp1`/`gfx` |`dram`|
| ---: | ---: | ---: | ---: |
| 2.63 | 1.44 | 0.29 | 0.87 |
As can be seen, `pkg` ≈ `pp0` + `pp1` + ~0.9 W (2.63 ≈ 1.44 + 0.29 + 0.90);
this extra watt is the cost of all the associated silicon, e.g. bridges, bus
controllers, caches, and the media processing engine.
[`dump_intel_rapl_consumption`]: https://chromium.googlesource.com/chromiumos/platform2/+/master/power_manager/tools/dump_intel_rapl_consumption.cc
[video]: https://commons.wikimedia.org/wiki/File:Big_Buck_Bunny_4K.webm
[go/power-consumption-meas-in-intel]: http://go/power-consumption-meas-in-intel
### Tracing CPU cycles and instantaneous buffer usage
TODO(mcasas): fill in this section.
## Verifying VaAPI installation and usage
### <a name="verify-driver"></a> Verify the VaAPI is correctly installed and can be loaded
`vainfo` is a small command line utility used to enumerate the supported
operation modes; it's developed in the [libva-utils] repository and is
available on ChromeOS dev images (as the [media-video/libva-utils] package) and
on Debian systems ([vainfo]). `vainfo` will try to load the appropriate backend
driver for the system and/or GPUs, and it fails if it cannot find/load it.
[libva-utils]: https://github.com/intel/libva-utils
[media-video/libva-utils]: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/master/media-video/libva-utils
[vainfo]: https://packages.debian.org/sid/main/vainfo
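For instance, running `vainfo` without arguments should print the driver in
use and the supported profile/entrypoint pairs. The output below is abbreviated
and purely illustrative; the exact versions and profiles vary per platform:

```shell
$ vainfo
vainfo: VA-API version: 1.x (libva 2.x.x)
vainfo: Driver version: <backend driver name and version>
vainfo: Supported profile and entrypoints
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
```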
### <a name="verify-vaapi"></a> Verify the VaAPI supports and/or uses a given codec
A few steps are customary to verify the support and use of a given codec.
To verify that the build and platform support video acceleration, launch
Chromium and navigate to `chrome://gpu`, then:
* Search for the "Video Acceleration Information" Section: this should
enumerate the available accelerated codecs and resolutions.
* If this section is empty, the "Log Messages" Section immediately below will
  often indicate an associated error, e.g.:
> vaInitialize failed: unknown libva error
that can usually be reproduced with `vainfo` (see the [previous
section](#verify-driver)).
To verify that a given video is being played back using the accelerated video
decoding backend:
* Navigate to a url that causes a video to be played. Leave it playing.
* Navigate to the `chrome://media-internals` tab.
* Find the entry associated with the video-playing tab.
* Scroll down to "`Player Properties`" and check the "`video_decoder`" entry:
it should say "GpuVideoDecoder".
### VaAPI on Linux
This configuration is **unsupported** (see [docs/linux_hw_video_decode.md]);
the following instructions are provided only as a reference for developers to
test the code paths on a Linux machine.
* Follow the instructions under the [Linux build setup] document, adding the
  GN argument `use_vaapi=true` in the args.gn file (please refer to the
  [Setting up the build] section).
* To support proprietary codecs such as H264/AVC1, add the options
  `proprietary_codecs = true` and `ffmpeg_branding = "Chrome"` to the GN args
  (see the sample invocation after this list).
* Build Chromium as usual.
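As an illustration, the GN arguments above can also be passed directly on the
command line when generating the build directory (the `out/gn` directory name
is just an example):

```shell
gn gen out/gn --args='use_vaapi=true proprietary_codecs=true ffmpeg_branding="Chrome"'
```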
At this point you should make sure the appropriate VA driver backend is working
correctly; try running `vainfo` from the command line and verify no errors show
up.
To run Chromium using the VaAPI, two arguments are necessary:
* `--ignore-gpu-blacklist`
* `--use-gl=desktop` or `--use-gl=egl`
```shell
./out/gn/chrome --ignore-gpu-blacklist --use-gl=egl
```
Note that you can set the environment variable `MESA_GLSL_CACHE_DISABLE=false`
if you want the GPU process to run in sandboxed mode, see
[crbug.com/264818](https://crbug.com/264818). To check whether the running GPU
process is sandboxed, open `chrome://gpu` and search for `Sandboxed` in the
driver information table. In addition, passing
`--gpu-sandbox-failures-fatal=yes` will prevent the GPU process from running in
non-sandboxed mode.
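Putting the pieces together, a sandboxed run combining the environment
variable and the flags above could look like this (a sketch; the binary path
assumes the `out/gn` build directory used earlier):

```shell
MESA_GLSL_CACHE_DISABLE=false ./out/gn/chrome --ignore-gpu-blacklist \
    --use-gl=egl --gpu-sandbox-failures-fatal=yes
```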
Refer to the [previous section](#verify-vaapi) to verify support and use of
the VaAPI.
[docs/linux_hw_video_decode.md]: https://chromium.googlesource.com/chromium/src/+/HEAD/docs/linux_hw_video_decode.md
[Linux build setup]: https://chromium.googlesource.com/chromium/src/+/HEAD/docs/linux_build_instructions.md
[Setting up the build]: https://chromium.googlesource.com/chromium/src/+/HEAD/docs/linux_build_instructions.md#setting-up-the-build