File: extract-attachments.md

package info (click to toggle)
pypdf 5.4.0-1
  • links: PTS, VCS
  • area: main
  • in suites: trixie
  • size: 17,484 kB
  • sloc: python: 39,672; makefile: 35
file content (30 lines) | stat: -rw-r--r-- 806 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
# Extract Attachments

PDF documents can contain attachments. Attachments have a name, but it might not
be unique. For this reason, the value of `reader.attachments["attachment_name"]`
is a list.

You can extract all attachments like this:

```python
from pypdf import PdfReader

reader = PdfReader("example.pdf")

for name, content_list in reader.attachments.items():
    for i, content in enumerate(content_list):
        with open(f"{name}-{i}", "wb") as fp:
            fp.write(content)
```

Alternatively, you can retrieve them in an object-oriented fashion if you need
further details for these files:

```python
from pypdf import PdfReader

reader = PdfReader("example.pdf")

for attachment in reader.attachment_list:
    print(attachment.name, attachment.alternative_name, attachment.content)
```