=========
Changelog
=========
0.23.2 (2026-03-10)
-------------------
* JSON files in :ref:`test fixtures <web-poet-testing>` are now saved using
UTF-8 instead of the system encoding.
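As a rough illustration of why an explicit encoding matters, the following standard-library sketch writes and reads a JSON file as UTF-8. It is not web-poet's serialization code; the file name and data are made up for the example.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical fixture content containing non-ASCII text.
data = {"name": "Café Ünïcode", "price": "10 €"}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "output.json"  # hypothetical file name

    # An explicit encoding makes the bytes on disk independent of the
    # platform's default encoding (for example cp1252 on some Windows setups).
    with path.open("w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)

    raw = path.read_bytes()
    restored = json.loads(raw.decode("utf-8"))
```

Without ``encoding="utf-8"``, the same call could produce different bytes, or fail, depending on the machine that generated the fixture.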
0.23.1 (2026-01-27)
-------------------
* :func:`@field <web_poet.fields.field>` no longer strips docstrings
from decorated methods.
0.23.0 (2026-01-22)
-------------------
* Dropped Python 3.9 support.
* Added :func:`~web_poet.annotation_encode` (see :ref:`input-annotations`) and
:func:`~web_poet.annotation_decode`.
* Implemented type hint improvements.
0.22.0 (2025-12-15)
-------------------
* :ref:`Tests <web-poet-testing>` now put expected and actual values into
:ref:`pytest user properties <web-poet-testing-user-props>`.
0.21.0 (2025-11-24)
-------------------
* Added :class:`~web_poet.pages.BrowserPage` page object class to work with
:class:`~web_poet.page_inputs.browser.BrowserResponse`.
* Added :attr:`BrowserResponse.text
<web_poet.page_inputs.browser.BrowserResponse.text>` attribute.
0.20.0 (2025-10-28)
-------------------
* Added support for Python 3.14.
* Added support for :class:`~.BrowserResponse`, :class:`~.AnyResponse` and
:class:`~.BrowserHtml` dependencies to the :ref:`testing framework
<web-poet-testing>`.
* Explicitly re-export public names.
0.19.2 (2025-08-22)
-------------------
* Fixed runtime resolving of type annotations for some types.
0.19.1 (2025-08-13)
-------------------
* Improved type annotations.
0.19.0 (2025-06-06)
-------------------
* Removed some deprecated code:

  * The ``web_poet.overrides`` module is removed.
  * The ``ItemWebPage``, ``OverrideRule`` and ``PageObjectRegistry`` classes
    are removed.
  * The ``from_override_rules()`` class method and the ``get_overrides()`` and
    ``search_overrides()`` methods of :class:`~web_poet.rules.RulesRegistry`
    are removed.
  * The ``overrides`` parameter of
    :meth:`~web_poet.rules.RulesRegistry.handle_urls` is removed.
  * The ``RequestUrl`` and ``ResponseUrl`` classes can no longer be imported
    from ``web_poet.page_inputs.http``.
* :ref:`Tests <web-poet-testing>` now support items with
:class:`~web_poet.page_inputs.url.RequestUrl` and
:class:`~web_poet.page_inputs.url.ResponseUrl` objects.
* Improved the :ref:`pytest plugin <web-poet-testing-pytest>`:

  * Pytest ≥ 7.0.0 is now required.
  * Tests within a test case can now be run individually.
  * Tests are now compatible with `vscode-python`_.

  .. _vscode-python: https://github.com/microsoft/vscode-python
* Fixed an error of :func:`~web_poet.pages.is_injectable` with
:class:`~types.GenericAlias` on Python ≤ 3.10.
0.18.0 (2025-01-30)
-------------------
* Removed support for Python 3.8, added support for Python 3.13.
* The minimum required version of :doc:`url-matcher <url-matcher:index>`
changed from ``0.2.0`` to ``0.4.0``.
* ``type(None)`` is no longer considered injectable.
* Added :meth:`RulesRegistry.top_rules_for_item()
<web_poet.rules.RulesRegistry.top_rules_for_item>`.
0.17.1 (2024-10-11)
-------------------
* :attr:`web_poet.mixins.SelectableMixin.selector` is now created with the
``base_url`` value set to ``self.url`` if this attribute exists.
* Added a mention of the :doc:`form2request library <form2request:index>` to
the :class:`~.HttpRequest` documentation.
* CI improvements.
0.17.0 (2024-03-04)
-------------------
* Now requires ``andi >= 0.5.0``.
* Package requirements that were unversioned now have minimum versions
specified.
* Added support for Python 3.12.
* Added support for ``typing.Annotated`` dependencies to the serialization and
testing code.
* Documentation improvements.
* CI improvements.
0.16.0 (2024-01-23)
-------------------
* Added new :class:`~.AnyResponse` which holds either :class:`~.BrowserResponse`,
or :class:`~.HttpResponse`.
* Documentation improvements.
0.15.1 (2023-11-21)
-------------------
* ``HttpRequestHeaders`` now has a ``from_bytes_dict`` class method, like
``HttpResponseHeaders``.
0.15.0 (2023-09-11)
-------------------
* A new dependency, :class:`~.Stats`, has been added. It allows storing
key-value data pairs for different purposes. See :ref:`stats`.
0.14.0 (2023-08-03)
-------------------
* Dropped Python 3.7 support.
* Now requires ``packaging >= 20.0``.
* Fixed detection of the :class:`~.Returns` base class.
* Improved docs.
* Updated type hints.
* Updated CI tools.
0.13.1 (2023-05-30)
-------------------
* Fixed an issue with :class:`~.HttpClient` which happens when a response with
a non-standard status code is received.
0.13.0 (2023-05-30)
-------------------
* A new dependency :class:`~.BrowserResponse` has been added. It contains a
browser-rendered page URL, status code and HTML.
* The :ref:`rules` documentation section has been rewritten.
0.12.0 (2023-05-05)
-------------------
* The :ref:`testing framework <web-poet-testing>` now allows defining a
:ref:`custom item adapter <web-poet-testing-adapters>`.
* We have made a backward-incompatible change on test fixture serialization:
the ``type_name`` field of exceptions has been renamed to ``import_path``.
* Fixed built-in Python types, e.g. ``int``, not working as :ref:`field
processors <field-processors>`.
0.11.0 (2023-04-24)
-------------------
* JMESPath_ support is now available: you can use :meth:`.WebPage.jmespath` and
:meth:`.HttpResponse.jmespath` to run queries on JSON responses.
* The testing framework now supports page objects that raise exceptions from
the ``to_item`` method.
.. _JMESPath: https://jmespath.org/
0.10.0 (2023-04-19)
-------------------
* New class :class:`~.Extractor` can be used for easier extraction of nested
fields (see :ref:`default-processors-nested`).
* Exceptions raised while getting a response for an additional request are now
saved in :ref:`test fixtures <web-poet-testing-additional-requests>`.
* Multiple documentation improvements and fixes.
* Add a ``twine check`` CI check.
0.9.0 (2023-03-30)
------------------
* Standardized :ref:`input validation <input-validation>`.
* :ref:`Field processors <field-processors>` can now also be defined through a
nested ``Processors`` class, so that field redefinitions in subclasses can
inherit them. See :ref:`default-processors`.
* :ref:`Field processors <field-processors>` can now opt in to receive the page
object whose field is being read.
* :class:`web_poet.fields.FieldsMixin` now keeps fields from all base classes
when using multiple inheritance.
* Fixed the documentation build.
0.8.1 (2023-03-03)
------------------
* Fix the error when calling :meth:`.to_item() <web_poet.pages.ItemPage.to_item>`,
:func:`item_from_fields_sync() <web_poet.fields.item_from_fields_sync>`, or
:func:`item_from_fields() <web_poet.fields.item_from_fields>` on page objects
defined as slotted attrs classes, while setting ``skip_nonitem_fields=True``.
0.8.0 (2023-02-23)
------------------
This release contains many improvements to the web-poet testing framework,
as well as some other improvements and bug fixes.
Backward-incompatible changes:
* :func:`~.cached_method` no longer caches exceptions for ``async def`` methods.
This makes the behavior the same for sync and async methods, and also makes
it consistent with Python's stdlib caching (i.e. :func:`functools.lru_cache`,
:func:`functools.cached_property`).
* The testing framework now uses the ``HttpResponse-info.json`` file name instead
of ``HttpResponse-other.json`` to store information about HttpResponse
instances. To make tests generated with older web-poet work, rename
these files on disk.
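The rename described above can be scripted. The following is a minimal sketch using only the standard library; the helper name ``rename_fixture_files`` and the directory layout in the demo are assumptions for illustration, not part of web-poet.

```python
import tempfile
from pathlib import Path

OLD_NAME = "HttpResponse-other.json"
NEW_NAME = "HttpResponse-info.json"


def rename_fixture_files(root: Path) -> list[Path]:
    """Rename every OLD_NAME file under root to NEW_NAME; return the new paths."""
    renamed = []
    # sorted() materializes the matches before any file is renamed.
    for old_path in sorted(root.rglob(OLD_NAME)):
        new_path = old_path.with_name(NEW_NAME)
        if new_path.exists():
            # Do not overwrite a file that was already migrated.
            continue
        old_path.rename(new_path)
        renamed.append(new_path)
    return renamed


# Demo on a throwaway directory tree (this layout is made up).
with tempfile.TemporaryDirectory() as tmp:
    case_dir = Path(tmp) / "fixtures" / "MyPage" / "test-1" / "inputs"
    case_dir.mkdir(parents=True)
    (case_dir / OLD_NAME).write_text("{}", encoding="utf-8")

    renamed = rename_fixture_files(Path(tmp))
    new_exists = (case_dir / NEW_NAME).exists()
    old_exists = (case_dir / OLD_NAME).exists()
```

If the fixtures are tracked in version control, renaming through the version control tool instead keeps the file history intact.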
Testing framework improvements:
* Improved test reporting: better diffs and error messages.
* By default, the pytest plugin now generates a test per item attribute
(see :ref:`web-poet-testing-pytest`). There is also an option
(``--web-poet-test-per-item``) to run a test per item instead.
* Page objects with the :class:`~.HttpClient` dependency are now supported
(see :ref:`web-poet-testing-additional-requests`).
* Page objects with the :class:`~.PageParams` dependency are now supported.
* Added a new ``python -m web_poet.testing rerun`` command
(see :ref:`web-poet-testing-tdd`).
* Fixed support for nested (indirect) dependencies in page objects.
Previously they were not handled properly by the testing
framework.
* Non-ASCII output is now stored without escaping in the test fixtures,
for better readability.
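The readability difference for non-ASCII output can be seen with the standard ``json`` module. This is only an illustration with made-up data; it does not show how web-poet writes its fixtures internally.

```python
import json

item = {"title": "Crème brûlée", "currency": "€"}  # made-up item data

# Default behaviour: non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(item)

# ensure_ascii=False keeps the characters as they are.
readable = json.dumps(item, ensure_ascii=False)
```

Both strings decode back to the same data; only the second one is easy to read in a diff or a text editor.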
Other changes:
* Testing and CI fixes.
* Fixed a packaging issue: ``tests`` and ``tests_extra`` packages were
installed, not just ``web_poet``.
0.7.2 (2023-02-01)
------------------
* Restore the minimum version of ``itemadapter`` from 0.7.1 to 0.7.0, and
prevent a similar issue from happening again in the future.
0.7.1 (2023-02-01)
------------------
* Updated the :ref:`tutorial <tutorial>` to cover recent features and focus on
best practices. Also, a new module was added, :mod:`web_poet.example`, that
allows using page objects while following the tutorial.
* :ref:`web-poet-testing` now covers :ref:`Git LFS <git-lfs>` and
:ref:`scrapy-poet <web-poet-testing-scrapy-poet>`, and recommends
``python -m pytest`` instead of ``pytest``.
* Improved the warning message when duplicate ``ApplyRule`` objects are found.
* ``HttpResponse-other.json`` content is now indented for better readability.
* Improved test coverage for :ref:`fields <fields>`.
0.7.0 (2023-01-18)
------------------
* Add :ref:`a framework for creating tests and running them with pytest
<web-poet-testing>`.
* Support implementing fields in mixin classes.
* Introduce new methods for :class:`web_poet.rules.RulesRegistry`:

  * :meth:`web_poet.rules.RulesRegistry.add_rule`
  * :meth:`web_poet.rules.RulesRegistry.overrides_for`
  * :meth:`web_poet.rules.RulesRegistry.page_cls_for_item`
* Improved the performance of :meth:`web_poet.rules.RulesRegistry.search` where
passing a single parameter of either ``instead_of`` or ``to_return`` results
in *O(1)* look-up time instead of *O(N)*. Additionally, having either
``instead_of`` or ``to_return`` present in multi-parameter search calls would
filter the initial candidate results resulting in a faster search.
* Support :ref:`page object dependency serialization <dep-serialization>`.
* Add new dependencies used in testing and serialization code: ``andi``,
``python-dateutil``, and ``time-machine``. Also ``backports.zoneinfo`` on
non-Windows platforms when the Python version is older than 3.9.
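The *O(1)* versus *O(N)* difference mentioned for ``search`` can be pictured with a toy registry that keeps a dictionary index next to its rule list. Everything below (``Rule``, ``IndexedRegistry``, the method names) is hypothetical and is not web-poet's implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in for a rule; not web-poet's ApplyRule."""

    use: str
    instead_of: str


class IndexedRegistry:
    """Toy registry that keeps a dict index keyed by ``instead_of``."""

    def __init__(self, rules):
        self.rules = list(rules)
        self._by_instead_of = defaultdict(list)
        for rule in self.rules:
            self._by_instead_of[rule.instead_of].append(rule)

    def search_linear(self, instead_of):
        # O(N): every rule is inspected.
        return [r for r in self.rules if r.instead_of == instead_of]

    def search_indexed(self, instead_of):
        # O(1) on average: a single dict lookup, no scan.
        return list(self._by_instead_of.get(instead_of, []))


registry = IndexedRegistry(
    [Rule("A", "Base"), Rule("B", "Base"), Rule("C", "Other")]
)
linear = registry.search_linear("Base")
indexed = registry.search_indexed("Base")
```

Both lookups return the same rules; the indexed one simply avoids walking the whole list, which is also why it can serve as a cheap pre-filter for multi-parameter searches.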
0.6.0 (2022-11-08)
------------------
In this release, the ``@handle_urls`` decorator gets an overhaul; it's not
required anymore to pass another Page Object class to
``@handle_urls("...", overrides=...)``.
Also, the ``@web_poet.field`` decorator gets support for output processing
functions, via the ``out`` argument.
Full list of changes:
* **Backwards incompatible:** ``PageObjectRegistry`` no longer supports
dict-like access.
* Official support for Python 3.11.
* New ``@web_poet.field(out=[...])`` argument, which allows setting output
processing functions for web-poet fields.
* The ``web_poet.overrides`` module is deprecated and replaced with
``web_poet.rules``.
* The ``@handle_urls`` decorator now creates ``ApplyRule`` instances
  instead of ``OverrideRule`` instances; ``OverrideRule`` is deprecated.
  ``ApplyRule`` is similar to ``OverrideRule``, but has the following differences:

  * ``ApplyRule`` accepts a ``to_return`` parameter, which should be the data
    container (item) class that the Page Object returns.
  * Passing a string to ``for_patterns`` would auto-convert it into
    ``url_matcher.Patterns``.
  * All arguments are now keyword-only except for ``for_patterns``.
* New signature and behavior of ``handle_urls``:

  * The ``overrides`` parameter is made optional and renamed to
    ``instead_of``.
  * If defined, the item class declared in a subclass of
    ``web_poet.ItemPage`` is used as the ``to_return`` parameter of
    ``ApplyRule``.
  * Multiple ``handle_urls`` annotations are allowed.
* ``PageObjectRegistry`` is replaced with ``RulesRegistry``; its API is changed:

  * **backwards incompatible** dict-like API is removed;
  * **backwards incompatible** *O(1)* lookups using
    ``.search(use=PageObject)`` have become *O(N)*;
  * ``search_overrides`` method is renamed to ``search``;
  * ``get_overrides`` method is renamed to ``get_rules``;
  * ``from_override_rules`` method is deprecated;
    use ``RulesRegistry(rules=...)`` instead.
* Typing improvements.
* Documentation, test, and warning message improvements.
Deprecations:
* The ``web_poet.overrides`` module is deprecated. Use ``web_poet.rules`` instead.
* The ``overrides`` parameter from ``@handle_urls`` is now deprecated.
Use the ``instead_of`` parameter instead.
* The ``OverrideRule`` class is now deprecated. Use ``ApplyRule`` instead.
* ``PageObjectRegistry`` is now deprecated. Use ``RulesRegistry`` instead.
* The ``from_override_rules`` method of ``PageObjectRegistry`` is now deprecated.
Use ``RulesRegistry(rules=...)`` instead.
* The ``PageObjectRegistry.get_overrides`` method is deprecated.
Use ``PageObjectRegistry.get_rules`` instead.
* The ``PageObjectRegistry.search_overrides`` method is deprecated.
Use ``PageObjectRegistry.search`` instead.
0.5.1 (2022-09-23)
------------------
* The BOM encoding from the response body is now read before the response
headers when deriving the response encoding.
* Minor typing improvements.
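The precedence described above (byte order mark first, headers second) can be sketched with the standard ``codecs`` module. The function name ``guess_encoding`` and its parameters are assumptions for illustration; web-poet's actual detection logic may differ in its details.

```python
import codecs
from typing import Optional

# Longer BOMs come first: the UTF-32 LE BOM starts with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def guess_encoding(body: bytes, header_charset: Optional[str]) -> Optional[str]:
    """Prefer a BOM found in the body; fall back to the header charset."""
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name
    return header_charset


with_bom = guess_encoding(codecs.BOM_UTF8 + b"hello", "latin-1")
without_bom = guess_encoding(b"hello", "latin-1")
utf32 = guess_encoding(codecs.BOM_UTF32_LE + b"h\x00\x00\x00", None)
```

Here a body that starts with a UTF-8 BOM is treated as UTF-8 even though the (hypothetical) header claims ``latin-1``.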
0.5.0 (2022-09-21)
------------------
Web-poet now includes a mini-framework for organizing extraction code
as Page Object properties::

    import attrs
    from web_poet import field, ItemPage

    @attrs.define
    class MyItem:
        foo: str
        bar: list[str]

    class MyPage(ItemPage[MyItem]):
        @field
        def foo(self):
            return "..."

        @field
        def bar(self):
            return ["...", "..."]
**Backwards incompatible changes**:
* ``web_poet.ItemPage`` is no longer an abstract base class that requires a
``to_item`` method to be implemented. Instead, it provides a default
``async def to_item`` implementation, which uses fields marked with
``web_poet.field`` to create an item. This change shouldn't break existing
user code, but it might affect typing.
Deprecations:
* ``web_poet.ItemWebPage`` is deprecated. Use ``web_poet.WebPage`` instead.
Other changes:
* web-poet is now declared as a PEP 561 package, which provides typing
information; mypy uses it by default.
* Documentation, test, typing and CI improvements.
0.4.0 (2022-07-26)
------------------
* New ``HttpResponse.urljoin`` method, which takes the page's base URL into
account.
* New ``HttpRequest.urljoin`` method.
* Standardized ``web_poet.exceptions.Retry`` exception, which allows
initiating a retry from the Page Object, e.g. based on page content.
* Documentation improvements.
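To show why the base URL matters when resolving links, here is a small sketch built on the standard ``urllib.parse.urljoin``. It does not call web-poet; the URLs are made up, and the assumption is that the page declares a base URL (for example through an HTML ``<base>`` element) that differs from the page URL.

```python
from urllib.parse import urljoin

page_url = "https://example.com/catalog/item.html"
# Assumed base URL, e.g. taken from <base href="..."> in the page.
base_url = "https://static.example.com/assets/"
relative = "img/photo.png"

# Resolving against the page URL ignores the declared base URL.
from_page = urljoin(page_url, relative)

# Resolving against the base URL gives the link a browser would follow.
from_base = urljoin(base_url, relative)
```

The two results point at different hosts, which is the kind of mistake a base-URL-aware ``urljoin`` helps avoid.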
0.3.0 (2022-06-14)
------------------
* Backwards incompatible change:

  * ``web_poet.requests.request_backend_var``
    is renamed to ``web_poet.requests.request_downloader_var``.
* Documentation and CI improvements.
0.2.0 (2022-06-10)
------------------
* Backwards incompatible change:

  * ``ResponseData`` is replaced with ``HttpResponse``.
    ``HttpResponse`` exposes methods useful for web scraping
    (such as xpath and css selectors, json loading),
    and handles web page encoding detection. There are also new
    types like ``HttpResponseBody`` and ``HttpResponseHeaders``.
* Added support for performing additional requests using
``web_poet.HttpClient``.
* Introduced ``web_poet.BrowserHtml`` dependency.
* Introduced ``web_poet.PageParams`` to pass arbitrary information
inside a Page Object.
* Added ``web_poet.handle_urls`` decorator, which allows declaring which
websites should be handled by the page objects. The lower-level
``PageObjectRegistry`` class is also available.
* Removed support for Python 3.6.
* Added support for Python 3.10.
0.1.1 (2021-06-02)
------------------
* ``base_url`` and ``urljoin`` shortcuts
0.1.0 (2020-07-18)
------------------
* Documentation
* WebPage, ItemPage, ItemWebPage, Injectable and ResponseData are available
as top-level imports (e.g. ``web_poet.ItemPage``)
0.0.1 (2020-04-27)
------------------
Initial release.