File: CHANGELOG.rst

package info (click to toggle)
rapidfuzz 3.12.2%2Bds-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 2,436 kB
  • sloc: python: 7,571; cpp: 7,481; sh: 30; makefile: 23
file content (1234 lines) | stat: -rw-r--r-- 33,550 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
Changelog
---------

[3.12.2] - 2025-03-02
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added wheels for pypy 3.11

Changed
~~~~~~~
* upgrade to ``Cython==3.0.12``

[3.12.1] - 2025-01-30
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix version number


[3.12.0] - 2025-01-16
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~~~~~
* generate code for fallback imports to be better parseable for tools bundling Python
  applications into a single binary (examples are cx-freeze and pyinstaller)

Added
~~~~~
* added support for taskflow 3.9.0

[3.11.0] - 2024-12-17
^^^^^^^^^^^^^^^^^^^^^
Performance
~~~~~~~~~~~
* improve calculation of min score inside partial_ratio so it can skip more alignments

Added
~~~~~
* added build support for emscripten

[3.10.1] - 2024-10-24
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix compilation on clang-19
* fix incorrect results in simd optimized implementation of Levenshtein and OSA on 32bit targets

Added
~~~~~
* added support for taskflow 3.8.0

[3.10.0] - 2024-09-21
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* drop support for Python 3.8
* switch build system to ``scikit-build-core``

[3.9.7] - 2024-09-02
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix crash in ``cdist`` due to Visual Studio upgrade

[3.9.6] - 2024-08-06
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* upgrade to ``Cython==3.0.11``
* add python 3.13 wheels

[3.9.5] - 2024-07-29
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* include simd binaries in pyinstaller builds
* fix builds with setuptools 72 by upgrading ``scikit-build``

[3.9.4] - 2024-07-02
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix bug in ``Levenshtein.editops`` and ``Levenshtein.opcodes`` which could lead
  to incorrect results and crashes for some inputs

[3.9.3] - 2024-05-31
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix None handling for queries in ``process.cdist`` for scorers not supporting SIMD

[3.9.2] - 2024-05-28
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix supported versions of taskflow in cmake to be in the range v3.3 - v3.7

[3.9.1] - 2024-05-19
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* disable AVX2 on MacOS since it did lead to illegal instructions being generated

[3.9.0] - 2024-05-02
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* significantly improve type hints for the library

Fixed
~~~~~
* fix cmake version parsing

[3.8.1] - 2024-04-07
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* use the correct version of ``rapidfuzz-cpp`` when building against a system installed version

[3.8.0] - 2024-04-06
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added ``process.cpdist`` which allows pairwise comparison of two collection of inputs

Fixed
~~~~~
* fix some minor errors in the type hints
* fix potentially incorrect results of JaroWinkler when using high prefix weights

[3.7.0] - 2024-03-21
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* reduce importtime

[3.6.2] - 2024-03-05
^^^^^^^^^^^^^^^^^^^^

Changed
~~~~~~~
* upgrade to ``Cython==3.0.9``

Fixed
~~~~~
* upgrade ``rapidfuzz-cpp`` which includes a fix for build issues on some compilers
* fix some issues with the sphinx config

[3.6.1] - 2023-12-28
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix overflow error on systems with ``sizeof(size_t) < 8``

[3.6.0] - 2023-12-26
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix pure python fallback implementation of ``fuzz.token_set_ratio``
* properly link with ``-latomic`` if ``std::atomic<uint64_t>`` is not natively supported

Performance
~~~~~~~~~~~
* add banded implementation of LCS / Indel. This improves the runtime from ``O((|s1|/64) * |s2|)`` to ``O((score_cutoff/64) * |s2|)``

Changed
~~~~~~~
* upgrade to ``Cython==3.0.7``
* cdist for many metrics now returns a matrix of ``uint32`` instead of ``int32`` by default

[3.5.2] - 2023-11-02
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* use _mm_malloc/_mm_free on macOS if aligned_alloc is unsupported

[3.5.1] - 2023-10-31
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix compilation failure on macOS

[3.5.0] - 2023-10-31
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* skip pandas ``pd.NA`` similar to ``None``
* add ``score_multiplier`` argument to ``process.cdist`` which allows multiplying the end result scores
  with a constant factor.
* drop support for Python 3.7

Performance
~~~~~~~~~~~
* improve performance of simd implementation for ``LCS`` / ``Indel`` / ``Jaro`` / ``JaroWinkler``
* improve performance of Jaro and Jaro Winkler for long sequences
* implement ``process.extract`` with ``limit=1`` using ``process.extractOne`` which can be faster

Fixed
~~~~~
* the preprocessing function was always called through Python due to a broken C-API version check
* fix wraparound issue in simd implementation of Jaro and Jaro Winkler

[3.4.0] - 2023-10-09
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* upgrade to ``Cython==3.0.3``
* add simd implementation for Jaro and Jaro Winkler

[3.3.1] - 2023-09-25
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* add missing tag for python 3.12 support

[3.3.0] - 2023-09-11
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* upgrade to ``Cython==3.0.2``
* implement the remaining missing features from the C++ implementation in the pure Python implementation

Added
~~~~~
* added support for Python 3.12

[3.2.0] - 2023-08-02
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* build x86 with sse2/avx2 runtime detection

[3.1.2] - 2023-07-19
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* upgrade to ``Cython==3.0.0``

[3.1.1] - 2023-06-06
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* upgrade to ``taskflow==3.6``

Fixed
~~~~~
* replace usage of ``isnan`` with ``std::isnan`` which fixes the build on NetBSD

[3.1.0] - 2023-06-02
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* added keyword argument ``pad`` to Hamming distance. This controls whether sequences of different
  length should be padded or lead to a ``ValueError``
* improve consistency of exception messages between the C++ and pure Python implementation
* upgrade required Cython version to ``Cython==3.0.0b3``

Fixed
~~~~~
* fix missing GIL restore when an exception is thrown inside ``process.cdist``
* fix incorrect type hints for the ``process`` module

[3.0.0] - 2023-04-16
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* allow the usage of ``Hamming`` for different string lengths. Length differences are handled as
  insertions / deletions
* remove support for boolean preprocessor functions in ``rapidfuzz.fuzz`` and ``rapidfuzz.process``.
  The processor argument is now always a callable or ``None``.
* update defaults of the processor argument to be ``None`` everywhere. For affected functions this can change results, since strings are no longer preprocessed.
  To get back the old behaviour pass ``processor=utils.default_process`` to these functions.
  The following functions are affected by this:

  * ``process.extract``, ``process.extract_iter``, ``process.extractOne``
  * ``fuzz.token_sort_ratio``, ``fuzz.token_set_ratio``, ``fuzz.token_ratio``, ``fuzz.partial_token_sort_ratio``, ``fuzz.partial_token_set_ratio``, ``fuzz.partial_token_ratio``, ``fuzz.WRatio``, ``fuzz.QRatio``

* ``rapidfuzz.process`` no longer calls scorers with ``processor=None``. For this reason user provided scorers no longer require this argument.
* remove option to pass keyword arguments to scorer via ``**kwargs`` in ``rapidfuzz.process``. They can be passed
  via a ``scorer_kwargs`` argument now. This ensures this does not break when extending function parameters and
  prevents naming clashes.
* remove ``rapidfuzz.string_metric`` module. Replacements for all functions are available in ``rapidfuzz.distance``

Added
~~~~~
* added support for arbitrary hashable sequence in the pure Python fallback implementation of all functions in ``rapidfuzz.distance``
* added support for ``None`` and ``float("nan")`` in ``process.cdist`` as long as the underlying scorer supports it.
  This is the case for all scorers returning normalized results.

Fixed
~~~~~
* fix division by zero in simd implementation of normalized metrics leading to incorrect results

[2.15.1] - 2023-04-11
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix incorrect tag dispatching implementation leading to AVX2 instructions in the SSE2 code path

Added
~~~~~
* add wheels for windows arm64

[2.15.0] - 2023-04-01
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* allow the usage of finite generators as choices in ``process.extract``

[2.14.0] - 2023-03-31
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* upgrade required Cython version to ``Cython==3.0.0b2``

Fixed
~~~~~
* fix handling of non symmetric scorers in pure python version of ``process.cdist``
* fix default dtype handling when using ``process.cdist`` with pure python scorers

[2.13.7] - 2022-12-20
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~~~
* fix function signature of ``get_requires_for_build_wheel``

[2.13.6] - 2022-12-11
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* reformat changelog as restructured text to get rig of ``m2r2`` dependency


[2.13.5] - 2022-12-11
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added docs to sdist

Fixed
~~~~~
* fix two cases of undefined behavior in ``process.cdist``

[2.13.4] - 2022-12-08
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* handle ``float("nan")`` similar to ``None`` for query / choice, since this is common for
  non-existent data in tools like numpy

Fixed
~~~~~
* fix handling on ``None``\ /\ ``float("nan")`` in ``process.distance``
* use absolute imports inside tests

[2.13.3] - 2022-12-03
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* improve handling of functions wrapped using ``functools.wraps``
* fix broken fallback to Python implementation when the a ``ImportError`` occurs on import.
  This can e.g. occur when the binary has a dependency on libatomic, but it is unavailable on
  the system
* define ``CMAKE_C_COMPILER_AR``\ /\ ``CMAKE_CXX_COMPILER_AR``\ /\ ``CMAKE_C_COMPILER_RANLIB``\ /\ ``CMAKE_CXX_COMPILER_RANLIB``
  if they are not defined yet

[2.13.2] - 2022-11-05
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix incorrect results in ``Hamming.normalized_similarity``
* fix incorrect score_cutoff handling in pure python implementation of
  ``Postfix.normalized_distance`` and ``Prefix.normalized_distance``
* fix ``Levenshtein.normalized_similarity`` and ``Levenshtein.normalized_distance``
  when used in combination with the process module
* ``fuzz.partial_ratio`` was not always symmetric when ``len(s1) == len(s2)``

[2.13.1] - 2022-11-02
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix bug in ``normalized_similarity`` of most scorers,
  leading to incorrect results when used in combination with the process module
* fix sse2 support
* fix bug in ``JaroWinkler`` and ``Jaro`` when used in the pure python process module
* forward kwargs in pure Python implementation of ``process.extract``

[2.13.0] - 2022-10-30
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix bug in ``Levenshtein.editops`` leading to crashes when used with ``score_hint``

Changed
~~~~~~~
* moved capi from ``rapidfuzz_capi`` into ``rapidfuzz``\ , since it will always
  succeed the installation now that there is a pure Python mode
* add ``score_hint`` argument to process module
* add ``score_hint`` argument to Levenshtein module

[2.12.0] - 2022-10-24
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* drop support for Python 3.6

Added
~~~~~
* added ``Prefix``\ /\ ``Suffix`` similarity

Fixed
~~~~~
* fixed packaging with pyinstaller

[2.11.1] - 2022-10-05
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix segmentation fault in ``process.cdist`` when used with an empty query sequence

[2.11.0] - 2022-10-02
^^^^^^^^^^^^^^^^^^^^^
Changes
~~~~~~~
* move jarowinkler dependency into rapidfuzz to simplify maintenance

Performance
~~~~~~~~~~~
* add SIMD implementation for ``fuzz.ratio``\ /\ ``fuzz.QRatio``\ /\ ``Levenshtein``\ /\ ``Indel``\ /\ ``LCSseq``\ /\ ``OSA`` to improve
  performance for short strings in cdist

[2.10.3] - 2022-09-30
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* use ``scikit-build=0.14.1`` on Linux, since ``scikit-build=0.15.0`` fails to find the Python Interpreter
* workaround gcc in bug in template type deduction

[2.10.2] - 2022-09-27
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix support for cmake versions below 3.17

[2.10.1] - 2022-09-25
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* modernize cmake build to fix most conda-forge builds

[2.10.0] - 2022-09-18
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* add editops to hamming distance

Performance
~~~~~~~~~~~
* strip common affix in osa distance

Fixed
~~~~~
* ignore missing pandas in Python 3.11 tests

[2.9.0] - 2022-09-16
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* add optimal string alignment (OSA)

[2.8.0] - 2022-09-11
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* ``fuzz.partial_ratio`` did not find the optimal alignment in some edge cases (#219)

Performance
~~~~~~~~~~~
* improve performance of ``fuzz.partial_ratio``

Changed
~~~~~~~
* increased minimum C++ version to C++17 (see #255)

[2.7.0] - 2022-09-11
^^^^^^^^^^^^^^^^^^^^
Performance
~~~~~~~~~~~
* improve performance of ``Levenshtein.distance``\ /\ ``Levenshtein.editops`` for
  long sequences.

Added
~~~~~
* add ``score_hint`` parameter to ``Levenshtein.editops`` which allows the use of a
  faster implementation

Changed
~~~~~~~
* all functions in the ``string_metric`` module do now raise a deprecation warning.
  They are now only wrappers for their replacement functions, which makes them slower
  when used with the process module

[2.6.1] - 2022-09-03
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix incorrect results of partial_ratio for long needles (#257)

[2.6.0] - 2022-08-20
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix hashing for custom classes

Added
~~~~~
* add support for slicing in ``Editops.__getitem__``\ /\ ``Editops.__delitem__``
* add ``DamerauLevenshtein`` module

[2.5.0] - 2022-08-14
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added support for KeyboardInterrupt in processor module
  It might still take a bit until the KeyboardInterrupt is registered, but
  no longer runs all text comparisons after pressing ``Ctrl + C``

Fixed
~~~~~
* fix default scorer used by cdist to use C++ implementation if possible

[2.4.4] - 2022-08-12
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* Added support for Python 3.11

[2.4.3] - 2022-08-08
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix value range of ``jaro_similarity``\ /\ ``jaro_winkler_similarity`` in the pure Python mode
  for the string_metric module
* fix missing atomix symbol on arm 32 bit

[2.4.2] - 2022-07-30
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* add missing symbol to pure Python which made the usage impossible

[2.4.1] - 2022-07-29
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix version number

[2.4.0] - 2022-07-29
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix banded Levenshtein implementation

Performance
~~~~~~~~~~~
* improve performance and memory usage of ``Levenshtein.editops``

  * memory usage is reduced from O(NM) to O(N)
  * performance is improved for long sequences

[2.3.0] - 2022-07-23
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* add ``as_matching_blocks`` to ``Editops``\ /\ ``Opcodes``
* add support for deletions from ``Editops``
* add ``Editops.apply``\ /\ ``Opcodes.apply``
* add ``Editops.remove_subsequence``

Changed
~~~~~~~
* merge adjacent similar blocks in ``Opcodes``

Fixed
~~~~~
* fix usage of ``eval(repr(Editop))``\ , ``eval(repr(Editops))``\ , ``eval(repr(Opcode))`` and ``eval(repr(Opcodes))``
* fix opcode conversion for empty source sequence
* fix validation for empty Opcode list passed into ``Opcodes.__init__``

[2.2.0] - 2022-07-19
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* added in-tree build backend to install cmake and ninja only when it is not installed yet
  and only when wheels are available

[2.1.4] - 2022-07-17
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* changed internal implementation of cdist to remove build dependency to numpy

Added
~~~~~
* added wheels for musllinux and manylinux ppc64le, s390x

[2.1.3] - 2022-07-09
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix missing type stubs

[2.1.2] - 2022-07-04
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* change src layout to make package import from root directory possible

[2.1.1] - 2022-06-30
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* allow installation without the C++ extension if it fails to compile
* allow selection of implementation via the environment variable ``RAPIDFUZZ_IMPLEMENTATION``
  which can be set to "cpp" or "python"

[2.1.0] - 2022-06-29
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added pure python fallback for all implementations with the following exceptions:

  * no support for sequences of hashables. Only strings supported so far
  * ``\*.editops`` / ``\*.opcodes`` functions not implemented yet
  * process.cdist does not support multithreading

Fixed
~~~~~
* fuzz.partial_ratio_alignment ignored the score_cutoff
* fix implementation of Hamming.normalized_similarity
* fix default score_cutoff of Hamming.similarity
* fix implementation of LCSseq.distance when used in the process module
* treat hash for -1 and -2 as different

[2.0.15] - 2022-06-24
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix integer wraparound in partial_ratio/partial_ratio_alignment

[2.0.14] - 2022-06-23
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix unlimited recursion in LCSseq when used in combination with the process module

Changed
~~~~~~~
* add fallback implementations of ``taskflow``\ , ``rapidfuzz-cpp`` and ``jarowinkler-cpp``
  back to wheel, since some package building systems like piwheels can't clone sources

[2.0.13] - 2022-06-22
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* use system version of cmake on arm platforms, since the cmake package fails to compile

[2.0.12] - 2022-06-22
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* add tests to sdist
* remove cython dependency for sdist

[2.0.11] - 2022-04-23
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* relax version requirements of dependencies to simplify packaging

[2.0.10] - 2022-04-17
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Do not include installations of jaro_winkler in wheels (regression from 2.0.7)

Changed
~~~~~~~
* Allow installation from system installed versions of ``rapidfuzz-cpp``\ , ``jarowinkler-cpp``
  and ``taskflow``

Added
~~~~~
* Added PyPy3.9 wheels on Linux

[2.0.9] - 2022-04-07
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Add missing Cython code in sdist
* consider float imprecision in score_cutoff (see #210)

[2.0.8] - 2022-04-07
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix incorrect score_cutoff handling in token_set_ratio and token_ratio

Added
~~~~~
* add longest common subsequence

[2.0.7] - 2022-03-13
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Do not include installations of jaro_winkler and taskflow in wheels

[2.0.6] - 2022-03-06
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix incorrect population of sys.modules which lead to submodules overshadowing
  other imports

Changed
~~~~~~~
* moved JaroWinkler and Jaro into a separate package

[2.0.5] - 2022-02-25
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix signed integer overflow inside hashmap implementation

[2.0.4] - 2022-02-21
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix binary size increase due to debug symbols
* fix segmentation fault in ``Levenshtein.editops``

[2.0.3] - 2022-02-18
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* Added fuzz.partial_ratio_alignment, which returns the result of fuzz.partial_ratio
  combined with the alignment this result stems from

Fixed
~~~~~
* Fix Indel distance returning incorrect result when using score_cutoff=1, when the strings
  are not equal. This affected other scorers like fuzz.WRatio, which use the Indel distance
  as well.

[2.0.2] - 2022-02-12
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix type hints
* Add back transpiled cython files to the sdist to simplify builds in package builders
  like FreeBSD port build or conda-forge

[2.0.1] - 2022-02-11
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix type hints
* Indel.normalized_similarity mistakenly used the implementation of Indel.normalized_distance

[2.0.0] - 2022-02-09
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added C-Api which can be used to extend RapidFuzz from different Python modules using any
  programming language which allows the usage of C-Apis (C/C++/Rust)
* added new scorers in ``rapidfuzz.distance.*``

  * port existing distances to this new api
  * add Indel distance along with the corresponding editops function

Changed
~~~~~~~
* when the result of ``string_metric.levenshtein`` or ``string_metric.hamming`` is below max
  they do now return ``max + 1`` instead of -1
* Build system moved from setuptools to scikit-build
* Stop including all modules in __init__.py, since they significantly slowed down import time

Removed
~~~~~~~
* remove the ``rapidfuzz.levenshtein`` module which was deprecated in v1.0.0 and scheduled for removal in v2.0.0
* dropped support for Python2.7 and Python3.5

Deprecated
~~~~~~~~~~
* deprecate support to specify processor in form of a boolean (will be removed in v3.0.0)

  * new functions will not get support for this in the first place

* deprecate ``rapidfuzz.string_metric`` (will be removed in v3.0.0). Similar scorers are available
  in ``rapidfuzz.distance.*``

Fixed
~~~~~
* process.cdist did raise an exception when used with a pure python scorer

Performance
~~~~~~~~~~~
* improve performance and memory usage of ``rapidfuzz.string_metric.levenshtein_editops``

  * memory usage is reduced by 33%
  * performance is improved by around 10%-20%

* significantly improve performance of  ``rapidfuzz.string_metric.levenshtein`` for ``max <= 31``
  using a banded implementation

[1.9.1] - 2021-12-13
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix bug in new editops implementation, causing it to SegFault on some inputs (see qurator-spk/dinglehopper#64)

[1.9.0] - 2021-12-11
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix some issues in the type annotations (see #163)

Performance
~~~~~~~~~~~
* improve performance and memory usage of ``rapidfuzz.string_metric.levenshtein_editops``

  * memory usage is reduced by 10x
  * performance is improved from ``O(N * M)`` to ``O([N / 64] * M)``

[1.8.3] - 2021-11-19
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* Added missing wheels for Python3.6 on MacOs and Windows (see #159)

[1.8.2] - 2021-10-27
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* Add wheels for Python 3.10 on MacOs

[1.8.1] - 2021-10-22
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix incorrect editops results (See #148)

[1.8.0] - 2021-10-20
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* Add Wheels for Python3.10 on all platforms except MacOs (see #141)
* Improve performance of ``string_metric.jaro_similarity`` and  ``string_metric.jaro_winkler_similarity`` for strings with a length <= 64

[1.7.1] - 2021-10-02
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fixed incorrect results of fuzz.partial_ratio for long needles (see #138)

[1.7.0] - 2021-09-27
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* Added typing for process.cdist
* Added multithreading support to cdist using the argument ``process.cdist``
* Add dtype argument to ``process.cdist`` to set the dtype of the result numpy array (see #132)
* Use a better hash collision strategy in the internal hashmap, which improves the worst case performance

[1.6.2] - 2021-09-15
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* improved performance of fuzz.ratio
* only import process.cdist when numpy is available

[1.6.1] - 2021-09-11
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* Add back wheels for Python2.7

[1.6.0] - 2021-09-10
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* fuzz.partial_ratio uses a new implementation for short needles (<= 64). This implementation is

  * more accurate than the current implementation (it is guaranteed to find the optimal alignment)
  * it is significantly faster

* Add process.cdist to compare all elements of two lists (see #51)

[1.5.1] - 2021-09-01
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix out of bounds access in levenshtein_editops

[1.5.0] - 2021-08-21
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* all scorers do now support similarity/distance calculations between any sequence of hashables. So it is possible to calculate e.g. the WER as:
  .. code-block::

     >>> string_metric.levenshtein(["word1", "word2"], ["word1", "word3"])
     1

Added
~~~~~
* Added type stub files for all functions
* added jaro similarity in ``string_metric.jaro_similarity``
* added jaro winkler similarity in ``string_metric.jaro_winkler_similarity``
* added Levenshtein editops in ``string_metric.levenshtein_editops``

Fixed
~~~~~
* Fixed support for set objects in ``process.extract``
* Fixed inconsistent handling of empty strings

[1.4.1] - 2021-03-30
^^^^^^^^^^^^^^^^^^^^
Performance
~~~~~~~~~~~
* improved performance of result creation in process.extract

Fixed
~~~~~
* Cython ABI stability issue (#95)
* fix missing decref in case of exceptions in process.extract

[1.4.0] - 2021-03-29
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* added processor support to ``levenshtein`` and ``hamming``
* added distance support to extract/extractOne/extract_iter

Fixed
~~~~~
* incorrect results of ``normalized_hamming`` and ``normalized_levenshtein`` when used with ``utils.default_process`` as processor

[1.3.3] - 2021-03-20
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix a bug in the mbleven implementation of the uniform Levenshtein distance and cover it with fuzz tests

[1.3.2] - 2021-03-20
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* some of the newly activated warnings caused build failures in the conda-forge build

[1.3.1] - 2021-03-20
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fixed issue in LCS calculation for partial_ratio (see #90)
* Fixed incorrect results for normalized_hamming and normalized_levenshtein when the processor ``utils.default_process`` is used
* Fix many compiler warnings

[1.3.0] - 2021-03-16
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* add wheels for a lot of new platforms
* drop support for Python 2.7

Performance
~~~~~~~~~~~
* use ``is`` instead of ``==`` to compare functions directly by address

Fixed
~~~~~
* Fix another ref counting issue
* Fix some issues in the Levenshtein distance algorithm (see #92)

[1.2.1] - 2021-03-08
^^^^^^^^^^^^^^^^^^^^
Performance
~~~~~~~~~~~
* further improve bitparallel implementation of uniform Levenshtein distance for strings with a length > 64 (in many cases more than 50% faster)

[1.2.0] - 2021-03-07
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* add more benchmarks to documentation

Performance
~~~~~~~~~~~
* add bitparallel implementation to InDel Distance (Levenshtein with the weights 1,1,2) for strings with a length > 64
* improve bitparallel implementation of uniform Levenshtein distance for strings with a length > 64
* use the InDel Distance and uniform Levenshtein distance in more cases instead of the generic implementation
* Directly use the Levenshtein implementation in C++ instead of using it through Python in process.*

[1.1.2] - 2021-03-03
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix reference counting in process.extract (see #81)

[1.1.1] - 2021-02-23
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Fix result conversion in process.extract (see #79)

[1.1.0] - 2021-02-21
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* string_metric.normalized_levenshtein supports now all weights
* when different weights are used for Insertion and Deletion the strings are not swapped inside the Levenshtein implementation anymore. So different weights for Insertion and Deletion are now supported.
* replace C++ implementation with a Cython implementation. This has the following advantages:

  * The implementation is less error prone, since a lot of the complex things are done by Cython
  * slightly faster than the current implementation (up to 10% for some parts)
  * about 33% smaller binary size
  * reduced compile time

* Added \*\*kwargs argument to process.extract/extractOne/extract_iter that is passed to the scorer
* Add max argument to hamming distance
* Add support for whole Unicode range to utils.default_process

Performance
~~~~~~~~~~~
* replaced Wagner Fischer usage in the normal Levenshtein distance with a bitparallel implementation

[1.0.2] - 2021-02-19
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* The bitparallel LCS algorithm in fuzz.partial_ratio did not find the longest common substring properly in some cases.
  The old algorithm is used again until this bug is fixed.

[1.0.1] - 2021-02-17
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* string_metric.normalized_levenshtein supports now the weights (1, 1, N) with N >= 1

Performance
~~~~~~~~~~~
* The Levenshtein distance with the weights (1, 1, >2) do now use the same implementation as the weight (1, 1, 2), since
  ``Substitution > Insertion + Deletion`` has no effect

Fixed
~~~~~
* fix uninitialized variable in bitparallel Levenshtein distance with the weight (1, 1, 1)

[1.0.0] - 2021-02-12
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* all normalized string_metrics can now be used as scorer for process.extract/extractOne
* Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future.
* increased test coverage, that already helped to fix some bugs and help to prevent regressions in the future
* improved docstrings of functions

Performance
~~~~~~~~~~~
* Added bit-parallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2).
* Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, that is even faster, than the bit-parallel implementation.
* Improved performance of ``fuzz.partial_ratio``
  -> Since ``fuzz.ratio`` and ``fuzz.partial_ratio`` are used in most scorers, this improves the overall performance.
* Improved performance of ``process.extract`` and ``process.extractOne``

Deprecated
~~~~~~~~~~
* the ``rapidfuzz.levenshtein`` module is now deprecated and will be removed in v2.0.0
  These functions are now placed in ``rapidfuzz.string_metric``. ``distance``\ , ``normalized_distance``\ , ``weighted_distance`` and ``weighted_normalized_distance`` are combined into ``levenshtein`` and ``normalized_levenshtein``.

Added
~~~~~
* added normalized version of the hamming distance in ``string_metric.normalized_hamming``
* process.extract_iter as a generator, that yields the similarity of all elements, that have a similarity >= score_cutoff

Fixed
~~~~~
* multiple bugs in extractOne when used with a scorer, that's not from RapidFuzz
* fixed bug in ``token_ratio``
* fixed bug in result normalization causing zero division

[0.14.2] - 2020-12-31
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* utf8 usage in the copyright header caused problems with python2.7 on some platforms (see #70)

[0.14.1] - 2020-12-13
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* when a custom processor like ``lambda s: s`` was used with any of the methods inside fuzz.* it always returned a score of 100. This release fixes this and adds a better test coverage to prevent this bug in the future.

[0.14.0] - 2020-12-09
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added hamming distance metric in the levenshtein module

Performance
~~~~~~~~~~~
* improved performance of default_process by using lookup table

[0.13.4] - 2020-11-30
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Add missing virtual destructor that caused a segmentation fault on Mac Os

[0.13.3] - 2020-11-21
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* C++11 Support
* manylinux wheels

[0.13.2] - 2020-11-21
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* Levenshtein was not imported from __init__
* The reference count of a Python Object inside process.extractOne was decremented to early

[0.13.1] - 2020-11-17
^^^^^^^^^^^^^^^^^^^^^
Performance
~~~~~~~~~~~
* process.extractOne  exits early when a score of 100 is found. This way the other strings do not have to be preprocessed anymore.

[0.13.0] - 2020-11-16
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* string objects passed to scorers had to be strings even before preprocessing them. This was changed, so they only have to be strings after preprocessing similar to process.extract/process.extractOne

Performance
~~~~~~~~~~~
* process.extractOne is now implemented in C++ making it a lot faster
* When token_sort_ratio or partial_token_sort ratio is used inprocess.extractOne the words in the query are only sorted once to improve the runtime

Changed
~~~~~~~
* process.extractOne/process.extract do now return the index of the match, when the choices are a list.

Removed
~~~~~~~
* process.extractIndices got removed, since the indices are now already returned by process.extractOne/process.extract

[0.12.5] - 2020-10-26
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix documentation of process.extractOne (see #48)

[0.12.4] - 2020-10-22
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* Added wheels for

  * CPython 2.7 on windows 64 bit
  * CPython 2.7 on windows 32 bit
  * PyPy 2.7 on windows 32 bit

[0.12.3] - 2020-10-09
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix bug in partial_ratio (see #43)

[0.12.2] - 2020-10-01
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix inconsistency with fuzzywuzzy in partial_ratio when using strings of equal length

[0.12.1] - 2020-09-30
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* MSVC has a bug and therefore crashed on some of the templates used. This Release simplifies the templates so compiling on msvc works again

[0.12.0] - 2020-09-30
^^^^^^^^^^^^^^^^^^^^^
Performance
~~~~~~~~~~~
* partial_ratio is using the Levenshtein distance now, which is a lot faster. Since many of the other algorithms use partial_ratio, this helps to improve the overall performance

[0.11.3] - 2020-09-22
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix partial_token_set_ratio returning 100 all the time

[0.11.2] - 2020-09-12
^^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added rapidfuzz.__author__, rapidfuzz.__license__ and rapidfuzz.__version__

[0.11.1] - 2020-09-01
^^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* do not use auto junk when searching the optimal alignment for partial_ratio

[0.11.0] - 2020-08-22
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* support for python 2.7 added #40
* add wheels for python2.7 (both pypy and cpython) on MacOS and Linux

[0.10.0] - 2020-08-17
^^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* added wheels for Python3.9

Fixed
~~~~~
* tuple scores in process.extractOne are now supported #39