File: README.md

package info (click to toggle)
mf2py 2.0.1-2
  • links: PTS
  • area: main
  • in suites: sid, trixie
  • size: 664 kB
  • sloc: python: 1,917; makefile: 19
file content (148 lines) | stat: -rw-r--r-- 4,571 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
![mf2py banner](https://microformats.github.io/mf2py/banner.png)

[![version](https://badge.fury.io/py/mf2py.svg?)](https://badge.fury.io/py/mf2py)
[![downloads](https://img.shields.io/pypi/dm/mf2py)](https://pypistats.org/packages/mf2py)
[![license](https://img.shields.io/pypi/l/mf2py?)](https://github.com/microformats/mf2py/blob/main/LICENSE)
[![python-version](https://img.shields.io/pypi/pyversions/mf2py)](https://badge.fury.io/py/mf2py)

## Welcome 👋

`mf2py` is a Python [microformats](https://microformats.org/wiki/microformats) parser with full support for `microformats2`, backwards-compatible support for `microformats1` and experimental support for `metaformats`.

## Installation 💻

To install `mf2py` run the following command:

```bash
$ pip install mf2py

```

## Quickstart 🚀

Import the library:

```pycon
>>> import mf2py

```

### Parse an HTML Document from a file or string

```pycon
>>> with open("test/examples/eras.html") as fp:
...     mf2json = mf2py.parse(doc=fp)
>>> mf2json
{'items': [{'type': ['h-entry'],
            'properties': {'name': ['Excited for the Taylor Swift Eras Tour'],
                           'author': [{'type': ['h-card'],
                                       'properties': {'name': ['James'],
                                                      'url': ['https://example.com/']},
                                       'value': 'James',
                                       'lang': 'en-us'}],
                           'published': ['2023-11-30T19:08:09'],
                           'featured': [{'value': 'https://example.com/eras.jpg',
                                         'alt': 'Eras tour poster'}],
                           'content': [{'value': "I can't decide which era is my favorite.",
                                        'lang': 'en-us',
                                        'html': "<p>I can't decide which era is my favorite.</p>"}],
                           'category': ['music', 'Taylor Swift']},
            'lang': 'en-us'}],
 'rels': {'webmention': ['https://example.com/mentions']},
 'rel-urls': {'https://example.com/mentions': {'text': '',
                                               'rels': ['webmention']}},
 'debug': {'description': 'mf2py - microformats2 parser for python',
           'source': 'https://github.com/microformats/mf2py',
           'version': '2.0.1',
           'markup parser': 'html5lib'}}

```

```pycon
>>> mf2json = mf2py.parse(doc="<a class=h-card href=https://example.com>James</a>")
>>> mf2json["items"]
[{'type': ['h-card'],
  'properties': {'name': ['James'],
                 'url': ['https://example.com']}}]

```

### Parse an HTML Document from a URL

```pycon
>>> mf2json = mf2py.parse(url="https://events.indieweb.org")
>>> mf2json["items"][0]["type"]
['h-feed']
>>> mf2json["items"][0]["children"][0]["type"]
['h-event']

```

## Experimental Options

The following options can be invoked via keyword arguments to `parse()` and `Parser()`.

### `expose_dom`

Use `expose_dom=True` to expose the DOM of embedded properties.

### `metaformats`

Use `metaformats=True` to include any [metaformats](https://microformats.org/wiki/metaformats)
found.

### `filter_roots`

Use `filter_roots=True` to filter known conflicting user names (e.g. Tailwind).
Otherwise provide a custom list to filter instead.

## Advanced Usage

`parse` is a convenience function for `Parser`. More sophisticated behaviors are
available by invoking the parser object directly.

```pycon
>>> with open("test/examples/festivus.html") as fp:
...     mf2parser = mf2py.Parser(doc=fp)

```

#### Filter by Microformat Type

```pycon
>>> mf2json = mf2parser.to_dict()
>>> len(mf2json["items"])
7
>>> len(mf2parser.to_dict(filter_by_type="h-card"))
3
>>> len(mf2parser.to_dict(filter_by_type="h-entry"))
4

```

#### JSON Output

```pycon
>>> json = mf2parser.to_json()
>>> json_cards = mf2parser.to_json(filter_by_type="h-card")

```

## Breaking Changes in `mf2py` 2.0

- Image `alt` support is now on by default.

## Notes 📝

- If you pass a BeautifulSoup document it may be modified.
- A hosted version of `mf2py` is available at [python.microformats.io](https://python.microformats.io).

## Contributing 🛠️

We welcome contributions and bug reports via GitHub.

This project follows the [IndieWeb code of conduct](https://indieweb.org/code-of-conduct). Please be respectful of other contributors and forge a spirit of positive co-operation without discrimination or disrespect.

## License 🧑‍⚖️

`mf2py` is licensed under an MIT License.