Welcome to w3lib's documentation!
=================================
Overview
========
This is a Python library of web-related functions, such as:
* remove comments or tags from HTML snippets
* extract the base URL from HTML snippets
* translate entities in HTML strings
* convert raw HTTP headers to dicts and vice versa
* construct HTTP auth headers
* convert HTML pages to Unicode
* sanitize URLs (like browsers do)
* extract arguments from URLs
The w3lib library is licensed under the BSD license.
Modules
=======
.. toctree::
   :maxdepth: 4

   w3lib
Requirements
============
Python 3.9+
Install
=======
``pip install w3lib``
Tests
=====
:doc:`pytest <pytest:index>` is the preferred way to run tests. Run
``pytest`` from the root directory to execute the tests using the default
Python interpreter.
:doc:`tox <tox:index>` can be used to run the tests for all supported Python
versions. Install it with ``pip install tox``, then run ``tox`` from the
root directory; the tests will be executed for all available Python
interpreters.
Changelog
=========
.. include:: ../NEWS
   :start-line: 3
History
-------
The code of w3lib was originally part of the :doc:`Scrapy framework
<scrapy:index>` but was later split out of Scrapy, with the aim of making it
more reusable and providing a useful library of web functions that does not
depend on Scrapy.
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`