1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
|
.. _Appendix 3:
================================================
Appendix 3: Considerations on Embedded Files
================================================
This chapter provides some background on embedded files support in PyMuPDF.
General
----------
Starting with version 1.4, PDF supports embedding arbitrary files as part ("Embedded File Streams") of a PDF document file (see chapter 3.10.3, pp. 184 of the :ref:`AdobeManual`).
In many aspects, this is comparable to concepts also found in ZIP files or the OLE technique in MS Windows. PDF embedded files do, however, *not* support directory structures as does the ZIP format. An embedded file can in turn contain embedded files itself.
Advantages of this concept are that embedded files are under the PDF umbrella, benefitting from its permissions / password protection and integrity aspects: all data, which a PDF may reference or even may be dependent on, can be bundled into it and so form a single, consistent unit of information.
In addition to embedded files, PDF 1.7 adds *collections* to its support range. This is an advanced way of storing and presenting meta information (i.e. arbitrary and extensible properties) of embedded files.
MuPDF Support
--------------
After adding initial support for collections (portfolios) and */EmbeddedFiles* in MuPDF version 1.11, this support was dropped again in version 1.15.
As a consequence, the cli utility *mutool* no longer offers access to embedded files.
PyMuPDF -- having implemented an */EmbeddedFiles* API in response in its version 1.11.0 -- was therefore forced to change gears starting with its version 1.16.0 (we never published a MuPDF v1.15.x compatible PyMuPDF).
We are now maintaining our own code basis supporting embedded files. This code makes use of basic MuPDF dictionary and array functions only.
PyMuPDF Support
------------------
We continue to support the full old API with respect to embedded files -- with only minor, cosmetic changes.
There even also is a new function, which delivers a list of all names under which embedded data are resgistered in a PDF, :meth:`Document.embeddedFileNames`.
|