File: CHANGELOG.rst

package info (click to toggle)
rapidfuzz 3.12.2%2Bds-1
links: PTS, VCS
area: main
in suites: forky, sid, trixie
size: 2,436 kB
sloc: python: 7,571; cpp: 7,481; sh: 30; makefile: 23
file content (1234 lines) | stat: -rw-r--r-- 33,550 bytes
Changelog
---------

[3.12.2] - 2025-03-02
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added wheels for pypy 3.11

Changed
~~~~~~~
* upgrade to ``Cython==3.0.12``

[3.12.1] - 2025-01-30
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix version number


[3.12.0] - 2025-01-16
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~~~~~
* generate code for fallback imports to be better parseable for tools bundling Python
  applications into a single binary (examples are cx-freeze and pyinstaller)

Added
~~~~~
* added support for taskflow 3.9.0

[3.11.0] - 2024-12-17
^^^^^^^^^^^^^^^^^^^^^
Performance
~~~~~~~~~~~
* improve calculation of min score inside partial_ratio so it can skip more alignments

Added
~~~~~
* added build support for emscripten

[3.10.1] - 2024-10-24
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix compilation on clang-19
* fix incorrect results in simd optimized implementation of Levenshtein and OSA on 32bit targets

Added
~~~~~
* added support for taskflow 3.8.0

[3.10.0] - 2024-09-21
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* drop support for Python 3.8
* switch build system to ``scikit-build-core``

[3.9.7] - 2024-09-02
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix crash in ``cdist`` due to Visual Studio upgrade

[3.9.6] - 2024-08-06
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* upgrade to ``Cython==3.0.11``
* add python 3.13 wheels

[3.9.5] - 2024-07-29
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* include simd binaries in pyinstaller builds
* fix builds with setuptools 72 by upgrading ``scikit-build``

[3.9.4] - 2024-07-02
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix bug in ``Levenshtein.editops`` and ``Levenshtein.opcodes`` which could lead
  to incorrect results and crashes for some inputs

[3.9.3] - 2024-05-31
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix None handling for queries in ``process.cdist`` for scorers not supporting SIMD

[3.9.2] - 2024-05-28
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix supported versions of taskflow in cmake to be in the range v3.3 - v3.7

[3.9.1] - 2024-05-19
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* disable AVX2 on MacOS since it did lead to illegal instructions being generated

[3.9.0] - 2024-05-02
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* significantly improve type hints for the library

Fixed
~~~~~
* fix cmake version parsing

[3.8.1] - 2024-04-07
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* use the correct version of ``rapidfuzz-cpp`` when building against a system installed version

[3.8.0] - 2024-04-06
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added ``process.cpdist`` which allows pairwise comparison of two collection of inputs

Fixed
~~~~~
* fix some minor errors in the type hints
* fix potentially incorrect results of JaroWinkler when using high prefix weights

[3.7.0] - 2024-03-21
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* reduce importtime

[3.6.2] - 2024-03-05
^^^^^^^^^^^^^^^^^^^^

Changed
~~~~~~~
* upgrade to ``Cython==3.0.9``

Fixed
~~~~~
* upgrade ``rapidfuzz-cpp`` which includes a fix for build issues on some compilers
* fix some issues with the sphinx config

[3.6.1] - 2023-12-28
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix overflow error on systems with ``sizeof(size_t) < 8``

[3.6.0] - 2023-12-26
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix pure python fallback implementation of ``fuzz.token_set_ratio``
* properly link with ``-latomic`` if ``std::atomic<uint64_t>`` is not natively supported

Performance
~~~~~~~~~~~
* add banded implementation of LCS / Indel. This improves the runtime from ``O((|s1|/64) * |s2|)`` to ``O((score_cutoff/64) * |s2|)``

Changed
~~~~~~~
* upgrade to ``Cython==3.0.7``
* cdist for many metrics now returns a matrix of ``uint32`` instead of ``int32`` by default

[3.5.2] - 2023-11-02
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* use _mm_malloc/_mm_free on macOS if aligned_alloc is unsupported

[3.5.1] - 2023-10-31
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix compilation failure on macOS

[3.5.0] - 2023-10-31
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* skip pandas ``pd.NA`` similar to ``None``
* add ``score_multiplier`` argument to ``process.cdist`` which allows multiplying the end result scores
  with a constant factor.
* drop support for Python 3.7

Performance
~~~~~~~~~~~
* improve performance of simd implementation for ``LCS`` / ``Indel`` / ``Jaro`` / ``JaroWinkler``
* improve performance of Jaro and Jaro Winkler for long sequences
* implement ``process.extract`` with ``limit=1`` using ``process.extractOne`` which can be faster

Fixed
~~~~~
* the preprocessing function was always called through Python due to a broken C-API version check
* fix wraparound issue in simd implementation of Jaro and Jaro Winkler

[3.4.0] - 2023-10-09
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* upgrade to ``Cython==3.0.3``
* add simd implementation for Jaro and Jaro Winkler

[3.3.1] - 2023-09-25
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* add missing tag for python 3.12 support

[3.3.0] - 2023-09-11
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* upgrade to ``Cython==3.0.2``
* implement the remaining missing features from the C++ implementation in the pure Python implementation

Added
~~~~~
* added support for Python 3.12

[3.2.0] - 2023-08-02
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* build x86 with sse2/avx2 runtime detection

[3.1.2] - 2023-07-19
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* upgrade to ``Cython==3.0.0``

[3.1.1] - 2023-06-06
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* upgrade to ``taskflow==3.6``

Fixed
~~~~~
* replace usage of ``isnan`` with ``std::isnan`` which fixes the build on NetBSD

[3.1.0] - 2023-06-02
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* added keyword argument ``pad`` to Hamming distance. This controls whether sequences of different
  length should be padded or lead to a ``ValueError``
* improve consistency of exception messages between the C++ and pure Python implementation
* upgrade required Cython version to ``Cython==3.0.0b3``

Fixed
~~~~~
* fix missing GIL restore when an exception is thrown inside ``process.cdist``
* fix incorrect type hints for the ``process`` module

[3.0.0] - 2023-04-16
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* allow the usage of ``Hamming`` for different string lengths. Length differences are handled as
  insertions / deletions
* remove support for boolean preprocessor functions in ``rapidfuzz.fuzz`` and ``rapidfuzz.process``.
  The processor argument is now always a callable or ``None``.
* update defaults of the processor argument to be ``None`` everywhere. For affected functions this can change results, since strings are no longer preprocessed.
  To get back the old behaviour pass ``processor=utils.default_process`` to these functions.
  The following functions are affected by this:

  * ``process.extract``, ``process.extract_iter``, ``process.extractOne``
  * ``fuzz.token_sort_ratio``, ``fuzz.token_set_ratio``, ``fuzz.token_ratio``, ``fuzz.partial_token_sort_ratio``, ``fuzz.partial_token_set_ratio``, ``fuzz.partial_token_ratio``, ``fuzz.WRatio``, ``fuzz.QRatio``

* ``rapidfuzz.process`` no longer calls scorers with ``processor=None``. For this reason user provided scorers no longer require this argument.
* remove option to pass keyword arguments to scorer via ``**kwargs`` in ``rapidfuzz.process``. They can be passed
  via a ``scorer_kwargs`` argument now. This ensures this does not break when extending function parameters and
  prevents naming clashes.
* remove ``rapidfuzz.string_metric`` module. Replacements for all functions are available in ``rapidfuzz.distance``

Added
~~~~~
* added support for arbitrary hashable sequence in the pure Python fallback implementation of all functions in ``rapidfuzz.distance``
* added support for ``None`` and ``float("nan")`` in ``process.cdist`` as long as the underlying scorer supports it.
  This is the case for all scorers returning normalized results.

Fixed
~~~~~
* fix division by zero in simd implementation of normalized metrics leading to incorrect results

[2.15.1] - 2023-04-11
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix incorrect tag dispatching implementation leading to AVX2 instructions in the SSE2 code path

Added
~~~~~
* add wheels for windows arm64

[2.15.0] - 2023-04-01
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* allow the usage of finite generators as choices in ``process.extract``

[2.14.0] - 2023-03-31
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* upgrade required Cython version to ``Cython==3.0.0b2``

Fixed
~~~~~
* fix handling of non symmetric scorers in pure python version of ``process.cdist``
* fix default dtype handling when using ``process.cdist`` with pure python scorers

[2.13.7] - 2022-12-20
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~~~
* fix function signature of ``get_requires_for_build_wheel``

[2.13.6] - 2022-12-11
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* reformat changelog as restructured text to get rig of ``m2r2`` dependency


[2.13.5] - 2022-12-11
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added docs to sdist

Fixed
~~~~~
* fix two cases of undefined behavior in ``process.cdist``

[2.13.4] - 2022-12-08
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* handle ``float("nan")`` similar to ``None`` for query / choice, since this is common for
  non-existent data in tools like numpy

Fixed
~~~~~
* fix handling on ``None``\ /\ ``float("nan")`` in ``process.distance``
* use absolute imports inside tests

[2.13.3] - 2022-12-03
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* improve handling of functions wrapped using ``functools.wraps``
* fix broken fallback to Python implementation when the a ``ImportError`` occurs on import.
  This can e.g. occur when the binary has a dependency on libatomic, but it is unavailable on
  the system
* define ``CMAKE_C_COMPILER_AR``\ /\ ``CMAKE_CXX_COMPILER_AR``\ /\ ``CMAKE_C_COMPILER_RANLIB``\ /\ ``CMAKE_CXX_COMPILER_RANLIB``
  if they are not defined yet

[2.13.2] - 2022-11-05
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix incorrect results in ``Hamming.normalized_similarity``
* fix incorrect score_cutoff handling in pure python implementation of
  ``Postfix.normalized_distance`` and ``Prefix.normalized_distance``
* fix ``Levenshtein.normalized_similarity`` and ``Levenshtein.normalized_distance``
  when used in combination with the process module
* ``fuzz.partial_ratio`` was not always symmetric when ``len(s1) == len(s2)``

[2.13.1] - 2022-11-02
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix bug in ``normalized_similarity`` of most scorers,
  leading to incorrect results when used in combination with the process module
* fix sse2 support
* fix bug in ``JaroWinkler`` and ``Jaro`` when used in the pure python process module
* forward kwargs in pure Python implementation of ``process.extract``

[2.13.0] - 2022-10-30
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix bug in ``Levenshtein.editops`` leading to crashes when used with ``score_hint``

Changed
~~~~~~~
* moved capi from ``rapidfuzz_capi`` into ``rapidfuzz``\ , since it will always
  succeed the installation now that there is a pure Python mode
* add ``score_hint`` argument to process module
* add ``score_hint`` argument to Levenshtein module

[2.12.0] - 2022-10-24
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* drop support for Python 3.6

Added
~~~~~
* added ``Prefix``\ /\ ``Suffix`` similarity

Fixed
~~~~~
* fixed packaging with pyinstaller

[2.11.1] - 2022-10-05
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix segmentation fault in ``process.cdist`` when used with an empty query sequence

[2.11.0] - 2022-10-02
^^^^^^^^^^^^^^^^^^^^^
Changes
~~~~~~~
* move jarowinkler dependency into rapidfuzz to simplify maintenance

Performance
~~~~~~~~~~~
* add SIMD implementation for ``fuzz.ratio``\ /\ ``fuzz.QRatio``\ /\ ``Levenshtein``\ /\ ``Indel``\ /\ ``LCSseq``\ /\ ``OSA`` to improve
  performance for short strings in cdist

[2.10.3] - 2022-09-30
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* use ``scikit-build=0.14.1`` on Linux, since ``scikit-build=0.15.0`` fails to find the Python Interpreter
* workaround gcc in bug in template type deduction

[2.10.2] - 2022-09-27
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix support for cmake versions below 3.17

[2.10.1] - 2022-09-25
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* modernize cmake build to fix most conda-forge builds

[2.10.0] - 2022-09-18
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* add editops to hamming distance

Performance
~~~~~~~~~~~
* strip common affix in osa distance

Fixed
~~~~~
* ignore missing pandas in Python 3.11 tests

[2.9.0] - 2022-09-16
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* add optimal string alignment (OSA)

[2.8.0] - 2022-09-11
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* ``fuzz.partial_ratio`` did not find the optimal alignment in some edge cases (#219)

Performance
~~~~~~~~~~~
* improve performance of ``fuzz.partial_ratio``

Changed
~~~~~~~
* increased minimum C++ version to C++17 (see #255)

[2.7.0] - 2022-09-11
^^^^^^^^^^^^^^^^^^^^
Performance
~~~~~~~~~~~
* improve performance of ``Levenshtein.distance``\ /\ ``Levenshtein.editops`` for
  long sequences.

Added
~~~~~
* add ``score_hint`` parameter to ``Levenshtein.editops`` which allows the use of a
  faster implementation

Changed
~~~~~~~
* all functions in the ``string_metric`` module do now raise a deprecation warning.
  They are now only wrappers for their replacement functions, which makes them slower
  when used with the process module

[2.6.1] - 2022-09-03
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix incorrect results of partial_ratio for long needles (#257)

[2.6.0] - 2022-08-20
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix hashing for custom classes

Added
~~~~~
* add support for slicing in ``Editops.__getitem__``\ /\ ``Editops.__delitem__``
* add ``DamerauLevenshtein`` module

[2.5.0] - 2022-08-14
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added support for KeyboardInterrupt in processor module
  It might still take a bit until the KeyboardInterrupt is registered, but
  no longer runs all text comparisons after pressing ``Ctrl + C``

Fixed
~~~~~
* fix default scorer used by cdist to use C++ implementation if possible

[2.4.4] - 2022-08-12
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* Added support for Python 3.11

[2.4.3] - 2022-08-08
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix value range of ``jaro_similarity``\ /\ ``jaro_winkler_similarity`` in the pure Python mode
  for the string_metric module
* fix missing atomix symbol on arm 32 bit

[2.4.2] - 2022-07-30
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* add missing symbol to pure Python which made the usage impossible

[2.4.1] - 2022-07-29
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix version number

[2.4.0] - 2022-07-29
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix banded Levenshtein implementation

Performance
~~~~~~~~~~~
* improve performance and memory usage of ``Levenshtein.editops``

  * memory usage is reduced from O(NM) to O(N)
  * performance is improved for long sequences

[2.3.0] - 2022-07-23
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* add ``as_matching_blocks`` to ``Editops``\ /\ ``Opcodes``
* add support for deletions from ``Editops``
* add ``Editops.apply``\ /\ ``Opcodes.apply``
* add ``Editops.remove_subsequence``

Changed
~~~~~~~
* merge adjacent similar blocks in ``Opcodes``

Fixed
~~~~~
* fix usage of ``eval(repr(Editop))``\ , ``eval(repr(Editops))``\ , ``eval(repr(Opcode))`` and ``eval(repr(Opcodes))``
* fix opcode conversion for empty source sequence
* fix validation for empty Opcode list passed into ``Opcodes.__init__``

[2.2.0] - 2022-07-19
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* added in-tree build backend to install cmake and ninja only when it is not installed yet
  and only when wheels are available

[2.1.4] - 2022-07-17
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* changed internal implementation of cdist to remove build dependency to numpy

Added
~~~~~
* added wheels for musllinux and manylinux ppc64le, s390x

[2.1.3] - 2022-07-09
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix missing type stubs

[2.1.2] - 2022-07-04
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* change src layout to make package import from root directory possible

[2.1.1] - 2022-06-30
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* allow installation without the C++ extension if it fails to compile
* allow selection of implementation via the environment variable ``RAPIDFUZZ_IMPLEMENTATION``
  which can be set to "cpp" or "python"

[2.1.0] - 2022-06-29
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added pure python fallback for all implementations with the following exceptions:

  * no support for sequences of hashables. Only strings supported so far
  * ``\*.editops`` / ``\*.opcodes`` functions not implemented yet
  * process.cdist does not support multithreading

Fixed
~~~~~
* fuzz.partial_ratio_alignment ignored the score_cutoff
* fix implementation of Hamming.normalized_similarity
* fix default score_cutoff of Hamming.similarity
* fix implementation of LCSseq.distance when used in the process module
* treat hash for -1 and -2 as different

[2.0.15] - 2022-06-24
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix integer wraparound in partial_ratio/partial_ratio_alignment

[2.0.14] - 2022-06-23
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix unlimited recursion in LCSseq when used in combination with the process module

Changed
~~~~~~~
* add fallback implementations of ``taskflow``\ , ``rapidfuzz-cpp`` and ``jarowinkler-cpp``
  back to wheel, since some package building systems like piwheels can't clone sources

[2.0.13] - 2022-06-22
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* use system version of cmake on arm platforms, since the cmake package fails to compile

[2.0.12] - 2022-06-22
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* add tests to sdist
* remove cython dependency for sdist

[2.0.11] - 2022-04-23
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* relax version requirements of dependencies to simplify packaging

[2.0.10] - 2022-04-17
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Do not include installations of jaro_winkler in wheels (regression from 2.0.7)

Changed
~~~~~~~
* Allow installation from system installed versions of ``rapidfuzz-cpp``\ , ``jarowinkler-cpp``
  and ``taskflow``

Added
~~~~~
* Added PyPy3.9 wheels on Linux

[2.0.9] - 2022-04-07
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Add missing Cython code in sdist
* consider float imprecision in score_cutoff (see #210)

[2.0.8] - 2022-04-07
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix incorrect score_cutoff handling in token_set_ratio and token_ratio

Added
~~~~~
* add longest common subsequence

[2.0.7] - 2022-03-13
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Do not include installations of jaro_winkler and taskflow in wheels

[2.0.6] - 2022-03-06
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix incorrect population of sys.modules which lead to submodules overshadowing
  other imports

Changed
~~~~~~~
* moved JaroWinkler and Jaro into a separate package

[2.0.5] - 2022-02-25
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix signed integer overflow inside hashmap implementation

[2.0.4] - 2022-02-21
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix binary size increase due to debug symbols
* fix segmentation fault in ``Levenshtein.editops``

[2.0.3] - 2022-02-18
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* Added fuzz.partial_ratio_alignment, which returns the result of fuzz.partial_ratio
  combined with the alignment this result stems from

Fixed
~~~~~
* Fix Indel distance returning incorrect result when using score_cutoff=1, when the strings
  are not equal. This affected other scorers like fuzz.WRatio, which use the Indel distance
  as well.

[2.0.2] - 2022-02-12
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix type hints
* Add back transpiled cython files to the sdist to simplify builds in package builders
  like FreeBSD port build or conda-forge

[2.0.1] - 2022-02-11
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix type hints
* Indel.normalized_similarity mistakenly used the implementation of Indel.normalized_distance

[2.0.0] - 2022-02-09
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added C-Api which can be used to extend RapidFuzz from different Python modules using any
  programming language which allows the usage of C-Apis (C/C++/Rust)
* added new scorers in ``rapidfuzz.distance.*``

  * port existing distances to this new api
  * add Indel distance along with the corresponding editops function

Changed
~~~~~~~
* when the result of ``string_metric.levenshtein`` or ``string_metric.hamming`` is below max
  they do now return ``max + 1`` instead of -1
* Build system moved from setuptools to scikit-build
* Stop including all modules in __init__.py, since they significantly slowed down import time

Removed
~~~~~~~
* remove the ``rapidfuzz.levenshtein`` module which was deprecated in v1.0.0 and scheduled for removal in v2.0.0
* dropped support for Python2.7 and Python3.5

Deprecated
~~~~~~~~~~
* deprecate support to specify processor in form of a boolean (will be removed in v3.0.0)

  * new functions will not get support for this in the first place

* deprecate ``rapidfuzz.string_metric`` (will be removed in v3.0.0). Similar scorers are available
  in ``rapidfuzz.distance.*``

Fixed
~~~~~
* process.cdist did raise an exception when used with a pure python scorer

Performance
~~~~~~~~~~~
* improve performance and memory usage of ``rapidfuzz.string_metric.levenshtein_editops``

  * memory usage is reduced by 33%
  * performance is improved by around 10%-20%

* significantly improve performance of  ``rapidfuzz.string_metric.levenshtein`` for ``max <= 31``
  using a banded implementation

[1.9.1] - 2021-12-13
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix bug in new editops implementation, causing it to SegFault on some inputs (see qurator-spk/dinglehopper#64)

[1.9.0] - 2021-12-11
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix some issues in the type annotations (see #163)

Performance
~~~~~~~~~~~
* improve performance and memory usage of ``rapidfuzz.string_metric.levenshtein_editops``

  * memory usage is reduced by 10x
  * performance is improved from ``O(N * M)`` to ``O([N / 64] * M)``

[1.8.3] - 2021-11-19
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* Added missing wheels for Python3.6 on MacOs and Windows (see #159)

[1.8.2] - 2021-10-27
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* Add wheels for Python 3.10 on MacOs

[1.8.1] - 2021-10-22
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix incorrect editops results (See #148)

[1.8.0] - 2021-10-20
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* Add Wheels for Python3.10 on all platforms except MacOs (see #141)
* Improve performance of ``string_metric.jaro_similarity`` and  ``string_metric.jaro_winkler_similarity`` for strings with a length <= 64

[1.7.1] - 2021-10-02
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fixed incorrect results of fuzz.partial_ratio for long needles (see #138)

[1.7.0] - 2021-09-27
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* Added typing for process.cdist
* Added multithreading support to cdist using the argument ``process.cdist``
* Add dtype argument to ``process.cdist`` to set the dtype of the result numpy array (see #132)
* Use a better hash collision strategy in the internal hashmap, which improves the worst case performance

[1.6.2] - 2021-09-15
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* improved performance of fuzz.ratio
* only import process.cdist when numpy is available

[1.6.1] - 2021-09-11
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* Add back wheels for Python2.7

[1.6.0] - 2021-09-10
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* fuzz.partial_ratio uses a new implementation for short needles (<= 64). This implementation is

  * more accurate than the current implementation (it is guaranteed to find the optimal alignment)
  * it is significantly faster

* Add process.cdist to compare all elements of two lists (see #51)

[1.5.1] - 2021-09-01
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix out of bounds access in levenshtein_editops

[1.5.0] - 2021-08-21
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* all scorers do now support similarity/distance calculations between any sequence of hashables. So it is possible to calculate e.g. the WER as:
  .. code-block::

     >>> string_metric.levenshtein(["word1", "word2"], ["word1", "word3"])
     1

Added
~~~~~
* Added type stub files for all functions
* added jaro similarity in ``string_metric.jaro_similarity``
* added jaro winkler similarity in ``string_metric.jaro_winkler_similarity``
* added Levenshtein editops in ``string_metric.levenshtein_editops``

Fixed
~~~~~
* Fixed support for set objects in ``process.extract``
* Fixed inconsistent handling of empty strings

[1.4.1] - 2021-03-30
^^^^^^^^^^^^^^^^^^^^
Performance
~~~~~~~~~~~
* improved performance of result creation in process.extract

Fixed
~~~~~
* Cython ABI stability issue (#95)
* fix missing decref in case of exceptions in process.extract

[1.4.0] - 2021-03-29
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* added processor support to ``levenshtein`` and ``hamming``
* added distance support to extract/extractOne/extract_iter

Fixed
~~~~~
* incorrect results of ``normalized_hamming`` and ``normalized_levenshtein`` when used with ``utils.default_process`` as processor

[1.3.3] - 2021-03-20
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix a bug in the mbleven implementation of the uniform Levenshtein distance and cover it with fuzz tests

[1.3.2] - 2021-03-20
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* some of the newly activated warnings caused build failures in the conda-forge build

[1.3.1] - 2021-03-20
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fixed issue in LCS calculation for partial_ratio (see #90)
* Fixed incorrect results for normalized_hamming and normalized_levenshtein when the processor ``utils.default_process`` is used
* Fix many compiler warnings

[1.3.0] - 2021-03-16
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* add wheels for a lot of new platforms
* drop support for Python 2.7

Performance
~~~~~~~~~~~
* use ``is`` instead of ``==`` to compare functions directly by address

Fixed
~~~~~
* Fix another ref counting issue
* Fix some issues in the Levenshtein distance algorithm (see #92)

[1.2.1] - 2021-03-08
^^^^^^^^^^^^^^^^^^^^
Performance
~~~~~~~~~~~
* further improve bitparallel implementation of uniform Levenshtein distance for strings with a length > 64 (in many cases more than 50% faster)

[1.2.0] - 2021-03-07
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* add more benchmarks to documentation

Performance
~~~~~~~~~~~
* add bitparallel implementation to InDel Distance (Levenshtein with the weights 1,1,2) for strings with a length > 64
* improve bitparallel implementation of uniform Levenshtein distance for strings with a length > 64
* use the InDel Distance and uniform Levenshtein distance in more cases instead of the generic implementation
* Directly use the Levenshtein implementation in C++ instead of using it through Python in process.*

[1.1.2] - 2021-03-03
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix reference counting in process.extract (see #81)

[1.1.1] - 2021-02-23
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix result conversion in process.extract (see #79)

[1.1.0] - 2021-02-21
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* string_metric.normalized_levenshtein supports now all weights
* when different weights are used for Insertion and Deletion the strings are not swapped inside the Levenshtein implementation anymore. So different weights for Insertion and Deletion are now supported.
* replace C++ implementation with a Cython implementation. This has the following advantages:

  * The implementation is less error prone, since a lot of the complex things are done by Cython
  * slightly faster than the current implementation (up to 10% for some parts)
  * about 33% smaller binary size
  * reduced compile time

* Added \*\*kwargs argument to process.extract/extractOne/extract_iter that is passed to the scorer
* Add max argument to hamming distance
* Add support for whole Unicode range to utils.default_process

Performance
~~~~~~~~~~~
* replaced Wagner Fischer usage in the normal Levenshtein distance with a bitparallel implementation

[1.0.2] - 2021-02-19
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* The bitparallel LCS algorithm in fuzz.partial_ratio did not find the longest common substring properly in some cases.
  The old algorithm is used again until this bug is fixed.

[1.0.1] - 2021-02-17
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* string_metric.normalized_levenshtein supports now the weights (1, 1, N) with N >= 1

Performance
~~~~~~~~~~~
* The Levenshtein distance with the weights (1, 1, >2) do now use the same implementation as the weight (1, 1, 2), since
  ``Substitution > Insertion + Deletion`` has no effect

Fixed
~~~~~
* fix uninitialized variable in bitparallel Levenshtein distance with the weight (1, 1, 1)

[1.0.0] - 2021-02-12
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* all normalized string_metrics can now be used as scorer for process.extract/extractOne
* Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future.
* increased test coverage, that already helped to fix some bugs and help to prevent regressions in the future
* improved docstrings of functions

Performance
~~~~~~~~~~~
* Added bit-parallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2).
* Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, that is even faster, than the bit-parallel implementation.
* Improved performance of ``fuzz.partial_ratio``
  -> Since ``fuzz.ratio`` and ``fuzz.partial_ratio`` are used in most scorers, this improves the overall performance.
* Improved performance of ``process.extract`` and ``process.extractOne``

Deprecated
~~~~~~~~~~
* the ``rapidfuzz.levenshtein`` module is now deprecated and will be removed in v2.0.0
  These functions are now placed in ``rapidfuzz.string_metric``. ``distance``\ , ``normalized_distance``\ , ``weighted_distance`` and ``weighted_normalized_distance`` are combined into ``levenshtein`` and ``normalized_levenshtein``.

Added
~~~~~
* added normalized version of the hamming distance in ``string_metric.normalized_hamming``
* process.extract_iter as a generator, that yields the similarity of all elements, that have a similarity >= score_cutoff

Fixed
~~~~~
* multiple bugs in extractOne when used with a scorer, that's not from RapidFuzz
* fixed bug in ``token_ratio``
* fixed bug in result normalization causing zero division

[0.14.2] - 2020-12-31
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* utf8 usage in the copyright header caused problems with python2.7 on some platforms (see #70)

[0.14.1] - 2020-12-13
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* when a custom processor like ``lambda s: s`` was used with any of the methods inside fuzz.* it always returned a score of 100. This release fixes this and adds a better test coverage to prevent this bug in the future.

[0.14.0] - 2020-12-09
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added hamming distance metric in the levenshtein module

Performance
~~~~~~~~~~~
* improved performance of default_process by using lookup table

[0.13.4] - 2020-11-30
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Add missing virtual destructor that caused a segmentation fault on Mac Os

[0.13.3] - 2020-11-21
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* C++11 Support
* manylinux wheels

[0.13.2] - 2020-11-21
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Levenshtein was not imported from __init__
* The reference count of a Python Object inside process.extractOne was decremented to early

[0.13.1] - 2020-11-17
^^^^^^^^^^^^^^^^^^^^^
Performance
~~~~~~~~~~~
* process.extractOne  exits early when a score of 100 is found. This way the other strings do not have to be preprocessed anymore.

[0.13.0] - 2020-11-16
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* string objects passed to scorers had to be strings even before preprocessing them. This was changed, so they only have to be strings after preprocessing similar to process.extract/process.extractOne

Performance
~~~~~~~~~~~
* process.extractOne is now implemented in C++ making it a lot faster
* When token_sort_ratio or partial_token_sort ratio is used inprocess.extractOne the words in the query are only sorted once to improve the runtime

Changed
~~~~~~~
* process.extractOne/process.extract do now return the index of the match, when the choices are a list.

Removed
~~~~~~~
* process.extractIndices got removed, since the indices are now already returned by process.extractOne/process.extract

[0.12.5] - 2020-10-26
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix documentation of process.extractOne (see #48)

[0.12.4] - 2020-10-22
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* Added wheels for

  * CPython 2.7 on windows 64 bit
  * CPython 2.7 on windows 32 bit
  * PyPy 2.7 on windows 32 bit

[0.12.3] - 2020-10-09
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix bug in partial_ratio (see #43)

[0.12.2] - 2020-10-01
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix inconsistency with fuzzywuzzy in partial_ratio when using strings of equal length

[0.12.1] - 2020-09-30
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* MSVC has a bug and therefore crashed on some of the templates used. This Release simplifies the templates so compiling on msvc works again

[0.12.0] - 2020-09-30
^^^^^^^^^^^^^^^^^^^^^
Performance
~~~~~~~~~~~
* partial_ratio is using the Levenshtein distance now, which is a lot faster. Since many of the other algorithms use partial_ratio, this helps to improve the overall performance

[0.11.3] - 2020-09-22
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix partial_token_set_ratio returning 100 all the time

[0.11.2] - 2020-09-12
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added rapidfuzz.__author__, rapidfuzz.__license__ and rapidfuzz.__version__

[0.11.1] - 2020-09-01
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* do not use auto junk when searching the optimal alignment for partial_ratio

[0.11.0] - 2020-08-22
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* support for python 2.7 added #40
* add wheels for python2.7 (both pypy and cpython) on MacOS and Linux

[0.10.0] - 2020-08-17
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* added wheels for Python3.9

Fixed
~~~~~
* tuple scores in process.extractOne are now supported #39