What is liboggplay designed to do?
----------------------------------
There are several layers to a typical ogg playing application:
(1) Parsing of the ogg container format (handled by libogg and made usable by
liboggz) and routing of packets
(2) Decoding of individual ogg packets (handled by libtheora, libvorbis,
libfishsound, libcmml, etc. depending on contents)
(3) Maintenance of synchronisation between individual raw data streams
(4) Presentation of raw data to the user (via the screen / sound card / etc.)
liboggplay handles the first 3 layers for you, leaving just the actual
presentation for the application writer. It may not be immediately obvious
why the third layer is necessary, but the following simple explanation
provides justification.
Ogg packets are ordered by the last presentation time in each packet, so
there can be significant differences in the progression of the video and
audio streams at any given time. Consider the following (real) layout,
where each section between two vertical bars is an ogg page, and each entry
is a theora packet (T) or a vorbis packet (V). Each theora packet is marked
with the frame number contained within the packet, and the vorbis packet is
marked with the samples it contains. Packets which span pages are listed on
both pages, with dashes indicating that the packets are connected.
| T(0) | T(1)+T(2)-|-T(2)+T(3)+T(4)-|-T(4) | T(5)+T(6) | V(0-12992) |
In this case, there are:
* 6 complete theora packets (i.e. frames), representing 240 milliseconds of
data, followed by:
* 1 page of vorbis data, which provides 294 milliseconds of data
This means that after decoding the first frame, a player must decode
significant additional theora data in order to reach the required vorbis packet
that matches the frame. Alternatively, the player must queue up the additional
encoded packets, and wait for the appropriate time to decode. This often leads
to a multithreaded design with separate threads for file reading, demuxing,
decoding and presentation - an approach that works but is needlessly
complicated and very difficult to get right.
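
The durations quoted above can be checked with a little arithmetic. The
sketch below assumes 25 fps video and a 44100 Hz audio sample rate; neither
rate is stated in this document, but both are consistent with the quoted
240 ms and 294 ms. The function names are illustrative only and are not part
of liboggplay.

```c
/* Duration, in whole milliseconds, of `frames` video frames at `fps`
 * frames per second (assumed rate, see above). */
static long video_duration_ms(long frames, long fps)
{
    return 1000 * frames / fps;
}

/* Duration, in whole milliseconds (rounded down), of `samples` audio
 * samples at `sample_rate` samples per second (assumed rate, see above). */
static long audio_duration_ms(long samples, long sample_rate)
{
    return 1000 * samples / sample_rate;
}
```

Under these assumptions, video_duration_ms(6, 25) gives 240 and
audio_duration_ms(12992, 44100) gives 294, so the single vorbis page covers
more presentation time than the theora packets that precede it in the file.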
Liboggplay is designed to provide the ogg application writer with a very simple
API for ogg file decoding. It is designed such that the library is completely
single-threaded. The rationale for this is that ogg decoding is not an
expensive task in terms of CPU time - multicore systems do not need multiple
threads of execution to meet frame presentation deadlines and single core
systems do not benefit from multithreading during the decode process (in fact
they suffer, both because of the cost of message passing and because of the
increase in complexity). Furthermore, using a single-threaded paradigm for the
design of the liboggplay library helps ensure that this library is portable to
a wide range of systems.
Note that a separate presentation thread may be necessary for players,
especially if the player is concerned with exact timing of frame delivery or
decoding of HD content. In these cases, this thread will exist entirely in the
player application and be used to smooth out irregularities in the cost of
decoding individual frames. However, it is anticipated that applications
intending only to decode SD or lower content will probably not require
an additional thread and will benefit from the lower complexity of such an
approach. For example, on a 1.86 GHz Pentium M processor, less than 1% of
frames in a typical SD movie appear to require more than 70% of a 25 fps
frameslice (that is, more than 28 ms of the 40 ms available per frame).
*** These figures are currently based on limited tests and should be
checked on more SD content and extended to other systems! ***
Accordingly, using liboggplay the application writer simply:
(1) opens the ogg file through liboggplay
(2) inspects the tracks within the file and activates some or all of them
(3) sets a callback interval and starts liboggplay decoding
(4) handles each callback, which liboggplay issues once it has found enough
data to satisfy the next interval. In the callback, the application:
(a) requests decoded versions of required data (frames, audio samples,
etc.)
(b) presents this data to the screen or audio card (or queues this data for
presentation should a separate presentation thread be required).
This approach allows the application writer to effectively separate
synchronisation and data retrieval issues from data presentation issues, and
should also avoid the need for a multi-threaded application.
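
The callback model described above can be illustrated with a small,
self-contained sketch. It is not liboggplay's API: every name below
(SketchPacket, sketch_run, and so on) is hypothetical, and the "decoding" is
reduced to tracking how far each track's data extends. The point is only to
show a single-threaded loop that reads packets in file order and calls the
application once every track has enough data for the next interval.

```c
#include <stddef.h>

#define SKETCH_TRACKS 2   /* hypothetical: track 0 = video, track 1 = audio */

typedef struct {
    int  track;        /* index of the track this packet belongs to */
    long end_time_ms;  /* presentation time of the last data in the packet */
} SketchPacket;

/* Application callback; returns 0 to continue, non-zero to stop. */
typedef int (*SketchCallback)(long target_ms, void *user);

/* Consume packets in file order. Each time both tracks hold data up to the
 * next target time, call the application and advance the target by one
 * interval. Returns the number of callbacks made. */
static int sketch_run(const SketchPacket *pkts, size_t n, long interval_ms,
                      SketchCallback cb, void *user)
{
    long have_ms[SKETCH_TRACKS] = { 0, 0 };
    long target = interval_ms;
    int calls = 0;
    size_t i;

    if (pkts == NULL || interval_ms <= 0) return 0;
    for (i = 0; i < n; i++) {
        if (pkts[i].track < 0 || pkts[i].track >= SKETCH_TRACKS) continue;
        have_ms[pkts[i].track] = pkts[i].end_time_ms;
        while (have_ms[0] >= target && have_ms[1] >= target) {
            calls++;
            if (cb != NULL && cb(target, user) != 0) return calls;
            target += interval_ms;
        }
    }
    return calls;
}

/* Example application callback: remembers the most recent target time. */
static int sketch_remember_target(long target_ms, void *user)
{
    if (user == NULL) return 1;
    *(long *)user = target_ms;
    return 0;
}
```

Fed six video packets ending at 40, 80, ..., 240 ms followed by one audio
packet ending at 294 ms, with a 40 ms interval, this loop makes no callbacks
until the audio packet arrives and then makes six in a row. That is the
behaviour the layout example earlier in this document motivates.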
Design principles of liboggplay
-------------------------------
* Liboggplay is single-threaded. Hence operations should not incur unexpected
delays. This means (for example) that data decoding should only occur at the
explicit request of the user.
* Liboggplay hides internal data representations from the user. The only
non-opaque data should be fixed-representation raw output (e.g. raw YUV data
or raw PCM data).
* All data provided to liboggplay should be checked where possible. NULL
pointers should generate an error, not a segmentation fault.
* liboggplay's data provision mechanism should be safe for multithreaded
players. Decoded data will be added to lock-free circular queues and
thread-safe functions for popping data off these queues will be provided.
* It is not necessary for applications based around liboggplay to buffer data
on the off chance that an expensive frame may arrive later. Guaranteed
synchronised provision of data from several streams means that playback can
stall when necessary, which should make liboggplay applications more
responsive.
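
Two of the principles above (checked arguments and lock-free circular
queues) can be sketched together. The code below is an illustration under
stated assumptions, not liboggplay's implementation: it assumes a C11
compiler with <stdatomic.h>, exactly one producer thread and one consumer
thread, and all names (SketchQueue, sketch_queue_push, the error codes) are
hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_QUEUE_SLOTS 8   /* usable capacity is SLOTS - 1 */

enum {
    SKETCH_OK = 0,
    SKETCH_ERR_NULL = -1,    /* a required pointer argument was NULL */
    SKETCH_ERR_FULL = -2,
    SKETCH_ERR_EMPTY = -3
};

typedef struct {
    void *slots[SKETCH_QUEUE_SLOTS];
    atomic_size_t head;   /* next slot to pop; written only by the consumer */
    atomic_size_t tail;   /* next slot to push; written only by the producer */
} SketchQueue;

static int sketch_queue_init(SketchQueue *q)
{
    if (q == NULL) return SKETCH_ERR_NULL;
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
    return SKETCH_OK;
}

/* Producer side: report bad arguments as errors rather than crashing. */
static int sketch_queue_push(SketchQueue *q, void *item)
{
    size_t tail, next;

    if (q == NULL || item == NULL) return SKETCH_ERR_NULL;
    tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    next = (tail + 1) % SKETCH_QUEUE_SLOTS;
    if (next == atomic_load_explicit(&q->head, memory_order_acquire))
        return SKETCH_ERR_FULL;
    q->slots[tail] = item;
    /* Release: the slot write above becomes visible before the new tail. */
    atomic_store_explicit(&q->tail, next, memory_order_release);
    return SKETCH_OK;
}

/* Consumer side: pops the oldest item into *out. */
static int sketch_queue_pop(SketchQueue *q, void **out)
{
    size_t head;

    if (q == NULL || out == NULL) return SKETCH_ERR_NULL;
    head = atomic_load_explicit(&q->head, memory_order_relaxed);
    if (head == atomic_load_explicit(&q->tail, memory_order_acquire))
        return SKETCH_ERR_EMPTY;
    *out = q->slots[head];
    atomic_store_explicit(&q->head, (head + 1) % SKETCH_QUEUE_SLOTS,
                          memory_order_release);
    return SKETCH_OK;
}
```

Keeping one slot empty lets the queue distinguish "full" from "empty" using
only the two indices, so neither side ever needs a lock. A design with
several producers or consumers would need a different algorithm.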
Internal organisation of liboggplay
-----------------------------------
(see also libogg_data_layout.svg/.png)
(1) permanent structures
The OggPlay structure is the top-level structure for the library. It contains:
* a pointer to the user callback
* a pointer to an OggPlayReader (see below)
* an array of pointers to OggPlayDecode structures (one for each track in an
ogg file)
This structure is initialised and used mainly from src/liboggplay/oggplay.c
The OggPlayReader structure provides an interface for accessing ogg file data
in a streaming manner. The interface can be found in
include/oggplay/oggplay_reader.h. A simple implementation can be found in
src/liboggplay/oggplay_file_reader.c.
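
One common way to express a streaming reader interface in C is a structure
of function pointers that concrete readers fill in. The sketch below shows
that general pattern with an in-memory reader. It is not the contents of
oggplay_reader.h: the names SketchReader and SketchMemReader and the two
operations shown are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

typedef struct SketchReader SketchReader;

struct SketchReader {
    /* Copy up to `n` bytes into `buf`; returns bytes copied, 0 at the end. */
    size_t (*read)(SketchReader *self, void *buf, size_t n);
    /* Number of bytes that can be read right now. */
    size_t (*available)(SketchReader *self);
};

/* A reader over a fixed block of memory. `base` is the first member, so a
 * pointer to it can be converted back to the enclosing structure. */
typedef struct {
    SketchReader base;
    const unsigned char *data;
    size_t size;
    size_t pos;
} SketchMemReader;

static size_t sketch_mem_available(SketchReader *self)
{
    SketchMemReader *m = (SketchMemReader *)self;
    return m->size - m->pos;
}

static size_t sketch_mem_read(SketchReader *self, void *buf, size_t n)
{
    SketchMemReader *m = (SketchMemReader *)self;
    size_t left = m->size - m->pos;

    if (n > left) n = left;          /* never read past the end */
    memcpy(buf, m->data + m->pos, n);
    m->pos += n;
    return n;
}

static void sketch_mem_reader_init(SketchMemReader *m,
                                   const void *data, size_t size)
{
    m->base.read = sketch_mem_read;
    m->base.available = sketch_mem_available;
    m->data = (const unsigned char *)data;
    m->size = size;
    m->pos = 0;
}
```

Code that consumes data only ever sees a SketchReader pointer, so a file
reader, a network reader or a test reader like this one can be swapped in
without changing the caller.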
The OggPlayDecode structures contain information relevant to a single stream
(or track) within the ogg file. The structure is actually used as the first
element of a track-format-specific structure (e.g. OggPlayVorbisdecode or
OggPlayTheoraDecode). The structure contains a pointer back to the OggPlay
structure, as well as pointers to linked lists of OggPlayDataHeader structures.
This structure is used mainly from src/liboggplay/oggplay_callback.c and
src/liboggplay/oggplay_query.c, although the data_list and end_of_data_list
pointers are manipulated in src/liboggplay/oggplay_data.c.
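
Placing a common structure first inside a format-specific structure is a
standard C idiom: a pointer to the outer structure, suitably converted,
points to its first member and vice versa. The sketch below shows the idiom
with hypothetical types and fields (SketchDecode, SketchTheoraDecode,
frame_width); the real structures' members are not reproduced here.

```c
#include <stddef.h>

enum { SKETCH_TRACK_THEORA = 1, SKETCH_TRACK_VORBIS = 2 };

/* Fields common to every track, whatever its format. */
typedef struct {
    int   track_type;  /* says which format-specific structure this is */
    void *player;      /* stands in for the pointer back to the owner */
} SketchDecode;

/* Format-specific structure: the common part MUST be the first member. */
typedef struct {
    SketchDecode base;
    int frame_width;
    int frame_height;
} SketchTheoraDecode;

/* Generic code holds SketchDecode pointers and converts to the specific
 * type only after checking track_type. Returns -1 for NULL or for a track
 * that is not theora. */
static int sketch_frame_width(SketchDecode *d)
{
    if (d == NULL || d->track_type != SKETCH_TRACK_THEORA) return -1;
    return ((SketchTheoraDecode *)d)->frame_width;
}
```

This lets a single array of common-structure pointers hold tracks of every
format, which matches the array of per-track pointers described above.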
The OggPlayDataHeader structures contain information relevant to a single piece
of data from a track. This data:
* has a data type (and may be encoded or decoded) *** not yet implemented! ***
* has a presentation time
* has a lock field
The structures are manipulated in src/liboggplay/oggplay_data.c.
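
A minimal sketch of a per-track list with head and tail pointers follows.
The field names data_list and end_of_data_list come from the text above;
everything else (the Sketch* type names, the millisecond time unit, and the
meaning given to the lock field) is an assumption made for illustration and
may differ from oggplay_data.c.

```c
#include <stddef.h>

typedef struct SketchDataHeader {
    struct SketchDataHeader *next;
    long presentation_time_ms;
    int  lock;   /* assumed: non-zero while the application holds the record */
} SketchDataHeader;

typedef struct {
    SketchDataHeader *data_list;         /* oldest record */
    SketchDataHeader *end_of_data_list;  /* newest record */
} SketchTrackData;

/* Append a record at the tail. Returns 0 on success, -1 on NULL input. */
static int sketch_data_append(SketchTrackData *t, SketchDataHeader *rec)
{
    if (t == NULL || rec == NULL) return -1;
    rec->next = NULL;
    if (t->end_of_data_list == NULL)
        t->data_list = rec;
    else
        t->end_of_data_list->next = rec;
    t->end_of_data_list = rec;
    return 0;
}

/* Unlink records from the head while they are unlocked and older than
 * now_ms. Stops at the first locked record. Returns the number unlinked,
 * or -1 on NULL input. The caller still owns the unlinked records. */
static int sketch_data_drop_old(SketchTrackData *t, long now_ms)
{
    int dropped = 0;

    if (t == NULL) return -1;
    while (t->data_list != NULL && t->data_list->lock == 0 &&
           t->data_list->presentation_time_ms < now_ms) {
        SketchDataHeader *old = t->data_list;
        t->data_list = old->next;
        old->next = NULL;
        dropped++;
    }
    if (t->data_list == NULL) t->end_of_data_list = NULL;
    return dropped;
}
```

Keeping a tail pointer makes appending a constant-time operation, which
suits data that arrives in presentation order.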
(2) callback structures
The callback to the user application contains a pointer to an array of
OggPlayCallbackInfo structures. These provide a data type, the number
of records required to be collected in the current timeslice, the additional
number of available records, and a pointer to an array containing the records
themselves (the records are pointers to the OggPlayDataHeader objects).
The user application makes requests with this pointer (which is opaque to the
application) to retrieve individual records, which may need to be decoded on
demand.
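
The callback information described above might be laid out roughly as
follows. This is a sketch only: the type and field names are hypothetical,
the real structures are opaque to applications, and the assumption that the
records array holds the required records followed by the additional ones is
not confirmed by this document.

```c
#include <stddef.h>

typedef struct {
    int    data_type;          /* kind of data carried by this track */
    int    required_records;   /* records to present in this timeslice */
    int    available_records;  /* additional records already available */
    void **records;            /* assumed: required + additional entries */
} SketchCallbackInfo;

/* Sum the records that must be presented in the current timeslice across
 * all tracks of the given data type. Returns -1 if `info` is NULL. */
static int sketch_required_for_type(const SketchCallbackInfo *info,
                                    int num_tracks, int data_type)
{
    int total = 0;
    int i;

    if (info == NULL) return -1;
    for (i = 0; i < num_tracks; i++) {
        if (info[i].data_type == data_type)
            total += info[i].required_records;
    }
    return total;
}
```

An application callback would typically walk this array once per
timeslice, then fetch and present the required records for each track.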