1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134
|
.. image:: /assets/mechanical-soup-logo.png
:alt: MechanicalSoup. A Python library for automating website interaction.
Home page
---------
https://mechanicalsoup.readthedocs.io/
Overview
--------
A Python library for automating interaction with websites.
MechanicalSoup automatically stores and sends cookies, follows
redirects, and can follow links and submit forms. It doesn't do
JavaScript.
MechanicalSoup was created by `M Hickford
<https://github.com/hickford/>`__, who was a fond user of the
`Mechanize <https://github.com/jjlee/mechanize>`__ library.
Unfortunately, Mechanize was `incompatible with Python 3 until 2019
<https://github.com/python-mechanize/mechanize/issues/9>`__ and its development
stalled for several years. MechanicalSoup provides a similar API, built on Python
giants `Requests <http://docs.python-requests.org/en/latest/>`__ (for
HTTP sessions) and `BeautifulSoup
<https://www.crummy.com/software/BeautifulSoup/>`__ (for document
navigation). Since 2017 it is a project actively maintained by a small
team including `@hemberger <https://github.com/hemberger>`__ and `@moy
<https://github.com/moy/>`__.
|Gitter Chat|
Installation
------------
|Latest Version| |Supported Versions|
PyPy3 is also supported (and tested against).
Download and install the latest released version from `PyPI <https://pypi.python.org/pypi/MechanicalSoup/>`__::
pip install MechanicalSoup
Download and install the development version from `GitHub <https://github.com/MechanicalSoup/MechanicalSoup>`__::
pip install git+https://github.com/MechanicalSoup/MechanicalSoup
Installing from source (installs the version in the current working directory)::
python setup.py install
(In all cases, add ``--user`` to the ``install`` command to
install in the current user's home directory.)
Documentation
-------------
The full documentation is available on
https://mechanicalsoup.readthedocs.io/. You may want to jump directly to
the `automatically generated API
documentation <https://mechanicalsoup.readthedocs.io/en/stable/mechanicalsoup.html>`__.
Example
-------
From `<examples/expl_qwant.py>`__, code to get the results from
a Qwant search:
.. code:: python
"""Example usage of MechanicalSoup to get the results from the Qwant
search engine.
"""
import re
import mechanicalsoup
import html
import urllib.parse
# Connect to Qwant
browser = mechanicalsoup.StatefulBrowser(user_agent='MechanicalSoup')
browser.open("https://lite.qwant.com/")
# Fill-in the search form
browser.select_form('#search-form')
browser["q"] = "MechanicalSoup"
browser.submit_selected()
# Display the results
for link in browser.page.select('.result a'):
# Qwant shows redirection links, not the actual URL, so extract
# the actual URL from the redirect link:
href = link.attrs['href']
m = re.match(r"^/redirect/[^/]*/(.*)$", href)
if m:
href = urllib.parse.unquote(m.group(1))
print(link.text, '->', href)
More examples are available in `<examples/>`__.
For an example with a more complex form (checkboxes, radio buttons and
textareas), read `<tests/test_browser.py>`__
and `<tests/test_form.py>`__.
Development
-----------
|Build Status|
|Coverage Status|
|Documentation Status|
|CII Best Practices|
Instructions for building, testing and contributing to MechanicalSoup:
see `<CONTRIBUTING.rst>`__.
Common problems
---------------
Read the `FAQ
<https://mechanicalsoup.readthedocs.io/en/stable/faq.html>`__.
.. |Latest Version| image:: https://img.shields.io/pypi/v/MechanicalSoup.svg
:target: https://pypi.python.org/pypi/MechanicalSoup/
.. |Supported Versions| image:: https://img.shields.io/pypi/pyversions/mechanicalsoup.svg
:target: https://pypi.python.org/pypi/MechanicalSoup/
.. |Build Status| image:: https://github.com/MechanicalSoup/MechanicalSoup/actions/workflows/python-package.yml/badge.svg?branch=main
:target: https://github.com/MechanicalSoup/MechanicalSoup/actions/workflows/python-package.yml?query=branch%3Amain
.. |Coverage Status| image:: https://codecov.io/gh/MechanicalSoup/MechanicalSoup/branch/main/graph/badge.svg
:target: https://codecov.io/gh/MechanicalSoup/MechanicalSoup
.. |Documentation Status| image:: https://readthedocs.org/projects/mechanicalsoup/badge/?version=latest
:target: https://mechanicalsoup.readthedocs.io/en/latest/?badge=latest
.. |CII Best Practices| image:: https://bestpractices.coreinfrastructure.org/projects/1334/badge
:target: https://bestpractices.coreinfrastructure.org/projects/1334
.. |Gitter Chat| image:: https://badges.gitter.im/MechanicalSoup/MechanicalSoup.svg
:target: https://gitter.im/MechanicalSoup/Lobby
|