# SCC Reader
## Overview
The SCC reader (`ttconv/scc/reader.py`) converts [SCC](https://docs.inqscribe.com/2.2/format_scc.html) documents into
the [data model](./data-model.md).
## Usage
The SCC reader accepts as input a [Scenarist Closed
Caption](https://www.govinfo.gov/content/pkg/CFR-2007-title47-vol1/pdf/CFR-2007-title47-vol1-sec15-119.pdf) document that conforms
to the [CEA-608](https://shop.cta.tech/products/line-21-data-services) encoding specification and returns a `model.ContentDocument`
object.
```python
import ttconv.scc.reader as scc_reader
doc = scc_reader.to_model("src/test/resources/scc/pop-on.scc")
# doc can then be manipulated and written out using any of the writer modules
```
## Architecture
The input SCC document is read line by line. For each line, the time code prefix and the CEA-608 codes that follow it (see the
`ttconv/scc/codes` package) are processed to generate `SccCaptionParagraph` instances. Each paragraph associates a time and region
with the text (including line breaks) it contains (see the definition in `ttconv/scc/content.py`). Each paragraph is then converted to
a `model.P`, part of the output `model.ContentDocument` (see the `SccCaptionParagraph.to_paragraph()` method in
`ttconv/scc/paragraph.py`), following the recommendations specified in [SMPTE RP
2052-10:2013](https://ieeexplore.ieee.org/document/7289645).
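The first step described above, splitting a line into its time code prefix and its code words, can be illustrated with a minimal standalone sketch. It is not the code in `ttconv/scc/reader.py`: the names `SccLine` and `parse_scc_line` are hypothetical, and the sketch assumes that each code is a 4-digit hexadecimal word and that a `;` before the frame count marks a drop-frame time code.

```python
import re
from typing import List, NamedTuple, Optional

# Time code (HH:MM:SS:FF or HH:MM:SS;FF) followed by whitespace and the code words.
SCC_LINE_PATTERN = re.compile(r"^(\d{2}:\d{2}:\d{2}[:;]\d{2})\s+(.*)$")


class SccLine(NamedTuple):
    """Hypothetical container for one parsed SCC line."""
    time_code: str
    drop_frame: bool
    words: List[int]


def parse_scc_line(line: str) -> Optional[SccLine]:
    """Returns the parsed line, or None for lines without a time code prefix."""
    match = SCC_LINE_PATTERN.match(line.strip())
    if match is None:
        # e.g. the file header or a blank line
        return None
    time_code, codes = match.groups()
    # Each code is assumed to be a 16-bit word written as 4 hexadecimal digits.
    words = [int(code, 16) for code in codes.split()]
    return SccLine(time_code, ";" in time_code, words)


parsed = parse_scc_line("00:00:00:22 9425 9425 94ad")
print(parsed.time_code, [hex(word) for word in parsed.words])
```

Lines that do not start with a time code yield `None`, so a caller can skip them without special-casing the header.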
The paragraph generation is based on the buffer-based mechanism defined in the CEA-608 format: a buffer of caption
content is filled while some other content is displayed. These buffering and displaying processes can be synchronous or
asynchronous, based on the caption style (see `ttconv/scc/style.py`).
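As a rough illustration of the buffer-based mechanism, the sketch below models the pop-on case only, assuming that text is written to a non-displayed buffer and that the two buffers are swapped when the caption is shown. The class `PopOnBuffers` and its methods are hypothetical and do not reflect how `ttconv` organizes its internal state.

```python
from typing import List


class PopOnBuffers:
    """Hypothetical two-buffer model: one buffer is filled while the other is shown."""

    def __init__(self) -> None:
        self.back: List[str] = []   # content being buffered, not yet visible
        self.front: List[str] = []  # content currently displayed

    def write(self, text: str) -> None:
        """Appends text to the non-displayed buffer."""
        self.back.append(text)

    def flip(self) -> None:
        """Swaps the buffers, making the buffered content visible."""
        self.front, self.back = self.back, self.front

    def erase_back(self) -> None:
        """Clears the non-displayed buffer before the next caption is written."""
        self.back = []

    def visible_text(self) -> str:
        return "".join(self.front)


buffers = PopOnBuffers()
buffers.write("Lorem ipsum ")
buffers.write("dolor sit amet,")
print(repr(buffers.visible_text()))  # nothing is displayed until the flip
buffers.flip()
print(repr(buffers.visible_text()))
```

In this model the display changes only at `flip()`, which is why buffering and displaying are decoupled for this style. Styles that write straight to the displayed content would not need the second buffer.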
`ttconv/scc/utils.py` contains utility functions that convert geometrical dimensions between units,
and `ttconv/scc/disassembly.py` handles the conversion of CEA-608 codes to the _disassembly_ format.
## Disassembly
The SCC reader can dump SCC content in the [Disassembly](http://www.theneitherworld.com/mcpoodle/SCC_TOOLS/DOCS/SCC_TOOLS.HTML#ccd)
format, which is an ad hoc, human-readable description of the SCC content.
```python
import ttconv.scc.reader as scc_reader
print(scc_reader.to_disassembly("src/test/resources/scc/pop-on.scc"))
```
For instance, the following SCC line:
```
00:00:00:22 9425 9425 94ad 94ad 9470 9470 4c6f 7265 6d20 6970 7375 6d20 646f 6c6f 7220 7369 7420 616d 6574 2c80
```
is converted to:
```
00:00:00:22 {RU2}{RU2}{CR}{CR}{1500}{1500}Lorem ipsum dolor sit amet,
```
This is useful for debugging.
## Tests
Sample SCC files can be found in the `src/test/resources/scc` directory.