``urlutils`` - Structured URL
=============================

.. automodule:: boltons.urlutils

.. versionadded:: 17.2

The URL type
------------

.. autoclass:: boltons.urlutils.URL

   .. attribute:: URL.scheme

      The scheme is an ASCII string, normally lowercase, which
      specifies the semantics for the rest of the URL, as well as
      the network protocol in many cases. For example, "http" in
      "http://hatnote.com".

   .. attribute:: URL.username

      The username is a string used by some schemes for
      authentication. For example, "public" in
      "ftp://public@example.com".

   .. attribute:: URL.password

      The password is a string also used for
      authentication. Although passwords in URLs are technically
      deprecated by `RFC 3986 Section 7.5`_, they are still used in
      cases where the URL is private or the password is public. For
      example, "password" in "db://private:password@127.0.0.1".

      .. _RFC 3986 Section 7.5: https://tools.ietf.org/html/rfc3986#section-7.5
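
      As a minimal sketch of how the userinfo attributes relate to
      the text of a URL, the example URL above can be taken apart
      like this (the ``db`` scheme and the credentials are
      placeholders, not a real service):

      .. code-block:: python

         from boltons.urlutils import URL

         # Placeholder URL reused from the description above.
         url = URL('db://private:password@127.0.0.1')

         scheme = url.scheme      # 'db'
         username = url.username  # 'private'
         password = url.password  # 'password'
         host = url.host          # '127.0.0.1'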

   .. attribute:: URL.host

      The host is a string used to resolve the network location of the
      resource. It is either empty, a domain name, or an IP address
      (v4 or v6). "example.com", "127.0.0.1", and "::1" are all good
      examples of host strings.

      Per spec, fully-encoded output from :meth:`~URL.to_text` is
      `IDNA encoded`_ for compatibility with DNS.

      .. _IDNA encoded: https://en.wikipedia.org/wiki/Internationalized_domain_name#Example_of_IDNA_encoding
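
      To illustrate what IDNA encoding does to a non-ASCII host, here
      is a small sketch that uses Python's built-in ``idna`` codec
      directly rather than boltons itself. The hostname is a made-up
      example:

      .. code-block:: python

         # Hypothetical internationalized hostname.
         host = 'bücher.example'

         # Each non-ASCII label becomes an ASCII "xn--" label.
         encoded = host.encode('idna')     # b'xn--bcher-kva.example'

         # Decoding restores the original text form.
         decoded = encoded.decode('idna')  # 'bücher.example'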

   .. attribute:: URL.port

      The port is an integer used, along with :attr:`host`, in
      connecting to network locations. ``8080`` is the port in
      "http://localhost:8080/index.html".

      .. note::

         Many schemes have default ports, such as 80 for HTTP and 22
         for SSH, and `Section 3.2.3 of RFC 3986`_ states that when a
         URL's port is the same as its scheme's default port, the
         port should not be emitted::

           >>> URL('https://github.com:443/mahmoud/boltons').to_text()
           'https://github.com/mahmoud/boltons'

         Custom schemes can register their port with
         :func:`~boltons.urlutils.register_scheme`. See
         :attr:`URL.default_port` for more info.

         .. _Section 3.2.3 of RFC 3986: https://tools.ietf.org/html/rfc3986#section-3.2.3
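
      The following sketch shows the behavior described in the note,
      first with a built-in scheme and then with a custom one. The
      ``myproto`` scheme and port ``9999`` are hypothetical names
      chosen for illustration:

      .. code-block:: python

         from boltons.urlutils import URL, register_scheme

         url = URL('https://github.com:443/mahmoud/boltons')
         explicit_port = url.port          # 443, as written in the URL
         default_port = url.default_port   # 443, the default for https
         text = url.to_text()              # the default port is left out

         # Hypothetical custom scheme with its own default port.
         register_scheme('myproto', uses_netloc=True, default_port=9999)
         custom = URL('myproto://example.com:9999/status')
         custom_text = custom.to_text()    # ':9999' is left out as well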

   .. attribute:: URL.path

      The string starting with the first slash after the authority
      part of the URL and ending just before the first question mark
      (or "#", if there is no query string). Often percent-quoted for
      network use. "/a/b/c" is the path of
      "http://example.com/a/b/c?d=e".


   .. attribute:: URL.path_parts

      The :class:`tuple` form of :attr:`~URL.path`, split on
      slashes. Empty slash segments are preserved, including that of
      the leading slash::

        >>> url = URL('http://example.com/a/b/c')
        >>> url.path_parts
        ('', 'a', 'b', 'c')
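
      As a further sketch, a trailing slash produces an empty final
      segment in the same way, so :attr:`~URL.path` can be rebuilt by
      joining the parts with slashes:

      .. code-block:: python

         from boltons.urlutils import URL

         url = URL('http://example.com/a/b/')
         parts = url.path_parts       # ('', 'a', 'b', '')
         rebuilt = '/'.join(parts)    # '/a/b/'
         same = rebuilt == url.path   # True for this unquoted path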


   .. attribute:: URL.query_params

      An instance of :class:`~boltons.urlutils.QueryParamDict`, an
      :class:`~boltons.dictutils.OrderedMultiDict` subtype, mapping
      textual keys and values which follow the first question mark
      after the :attr:`path`. Also available as the handy alias
      ``qp``::

        >>> url = URL('http://boltons.readthedocs.io/en/latest/?utm_source=docs&sphinx=ok')
        >>> url.qp.keys()
        ['utm_source', 'sphinx']

      The keys and values are also percent-encoded for network use
      cases.
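
      Because the mapping is a multi-dict, repeated keys keep all of
      their values. A minimal sketch, using made-up parameter names:

      .. code-block:: python

         from boltons.urlutils import URL

         url = URL('http://example.com/?a=1&a=2&b=3')

         latest = url.qp['a']               # '2', the most recent value
         all_values = url.qp.getlist('a')   # ['1', '2']

         # add() appends a value without replacing existing ones.
         url.qp.add('c', '4')
         text = url.to_text()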

   .. attribute:: URL.fragment

      The string following the first '#' after the
      :attr:`query_params` until the end of the URL. It has no
      inherent internal structure, and is percent-quoted.

   .. automethod:: URL.from_parts
   .. automethod:: URL.to_text

   .. autoattribute:: URL.default_port
   .. autoattribute:: URL.uses_netloc

   .. automethod:: URL.get_authority

   .. automethod:: URL.normalize
   .. automethod:: URL.navigate



Related functions
~~~~~~~~~~~~~~~~~

.. autofunction:: boltons.urlutils.find_all_links

.. autofunction:: boltons.urlutils.register_scheme


Low-level functions
-------------------

A slew of functions used internally by :class:`~boltons.urlutils.URL`.

.. autofunction:: boltons.urlutils.parse_url
.. autofunction:: boltons.urlutils.parse_host
.. autofunction:: boltons.urlutils.parse_qsl
.. autofunction:: boltons.urlutils.resolve_path_parts
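
As a rough sketch of what the lowest level looks like,
:func:`~boltons.urlutils.parse_url` breaks a URL string into a plain
dictionary of components. The URL below is a made-up example:

.. code-block:: python

   from boltons.urlutils import parse_url

   parts = parse_url('http://example.com:8080/a/b?c=d#e')

   scheme = parts['scheme']      # 'http'
   host = parts['host']          # 'example.com'
   port = parts['port']          # 8080, converted to an integer
   path = parts['path']          # '/a/b'
   query = parts['query']        # 'c=d'
   fragment = parts['fragment']  # 'e'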

.. autoclass:: boltons.urlutils.QueryParamDict
   :members:

Quoting
~~~~~~~

URLs have many parts, and almost as many individual "quoting"
(encoding) strategies.

.. autofunction:: boltons.urlutils.quote_userinfo_part
.. autofunction:: boltons.urlutils.quote_path_part
.. autofunction:: boltons.urlutils.quote_query_part
.. autofunction:: boltons.urlutils.quote_fragment_part

There is, however, only one unquoting strategy:

.. autofunction:: boltons.urlutils.unquote
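
A minimal sketch of how the part-specific quoting functions pair with
the single unquoting function. The sample strings are arbitrary:

.. code-block:: python

   from boltons.urlutils import (quote_path_part, quote_query_part,
                                 unquote)

   segment = 'annual report'
   quoted_segment = quote_path_part(segment)   # 'annual%20report'

   value = 'rock & roll'
   quoted_value = quote_query_part(value)      # '&' is escaped here

   # One function reverses both, whichever part the text came from.
   restored_segment = unquote(quoted_segment)
   restored_value = unquote(quoted_value)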

Useful constants
----------------

.. attribute:: boltons.urlutils.SCHEME_PORT_MAP

   A mapping of URL schemes to their protocols' default
   ports. Painstakingly assembled from the `IANA scheme registry`_,
   `port registry`_, and independent research.

   Keys are lowercase strings, values are integers or None, with None
   indicating that the scheme does not have a default port (or may not
   support ports at all)::

     >>> import boltons.urlutils
     >>> boltons.urlutils.SCHEME_PORT_MAP['http']
     80
     >>> print(boltons.urlutils.SCHEME_PORT_MAP['file'])
     None

   See :attr:`URL.port` for more info on how it is used. See
   :attr:`~boltons.urlutils.NO_NETLOC_SCHEMES` for more scheme info.

   Also `available in JSON`_.

   .. _IANA scheme registry: https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml
   .. _port registry: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml
   .. _available in JSON: https://gist.github.com/mahmoud/2fe281a8daaff26cfe9c15d2c5bf5c8b
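
   A small sketch of a lookup helper built on this mapping. The
   ``default_port_for`` function is hypothetical and not part of
   boltons:

   .. code-block:: python

      from boltons.urlutils import SCHEME_PORT_MAP

      def default_port_for(scheme):
          """Return the default port for *scheme*, or None if unknown."""
          # Keys are lowercase, so normalize the input first.
          return SCHEME_PORT_MAP.get(scheme.lower())

      http_port = default_port_for('HTTP')   # 80
      file_port = default_port_for('file')   # None, no default port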


.. attribute:: boltons.urlutils.NO_NETLOC_SCHEMES

   This is a :class:`set` of schemes that explicitly do not support
   network resolution, such as "mailto" and "urn".
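
   As a brief sketch, membership in this set is reflected in
   :attr:`URL.uses_netloc`. The address below is a placeholder:

   .. code-block:: python

      from boltons.urlutils import NO_NETLOC_SCHEMES, URL

      listed = 'mailto' in NO_NETLOC_SCHEMES   # True

      url = URL('mailto:user@example.com')
      has_netloc = url.uses_netloc             # False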