File: __init__.py

gumbo-parser 0.10.1+dfsg-5
"""Gumbo HTML parser.

These are the Python bindings for Gumbo.  All public API classes and functions
are exported from this module.  They include:

- CTypes representations of all structs and enums defined in gumbo.h.  The
  naming convention is to take the C name and strip off the "Gumbo" prefix.

- A low-level wrapper around the gumbo_parse function, returning the classes
  exposed above.  Usage:

  import gumbo
  with gumbo.parse(text, **options) as output:
    do_stuff_with_doctype(output.document)
    do_stuff_with_parse_tree(output.root)

- Higher-level bindings that mimic the API provided by html5lib.  Usage:

  from gumbo import html5lib

  This requires html5lib to be installed (it uses html5lib's treebuilders),
  and is intended as a drop-in replacement.

- Similarly, higher-level bindings that mimic BeautifulSoup and return
  BeautifulSoup objects.  For this, use:

  import gumbo
  soup = gumbo.soup_parse(text, **options)

  This returns a soup object like the one produced by
  BeautifulSoup.BeautifulSoup(text).
"""

from gumbo.gumboc import *

try:
  from gumbo import html5lib_adapter as html5lib
except ImportError:
  # html5lib not installed
  pass

try:
  from gumbo.soup_adapter import parse as soup_parse
except ImportError:
  # BeautifulSoup not installed
  pass
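
# Illustrative sketch, not part of upstream Gumbo: the optional adapters above
# are skipped silently when their dependencies are missing, so callers may want
# to check which names were actually bound.  The helper name
# `_available_adapters` is hypothetical; pass it a namespace dict, for example
# vars(gumbo) after importing this package.
def _available_adapters(namespace):
  """Return the optional adapter names present in the given namespace dict."""
  optional_names = ("html5lib", "soup_parse")
  return [name for name in optional_names if name in namespace]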