# Usage
## Matcher
The [pfzy](https://github.com/jhawthorn/fzy) package provides an async entry point, [fuzzy_match](https://pfzy.readthedocs.io/en/latest/pages/api.html#pfzy.match.fuzzy_match), that
fuzzy-matches a query string against a list of strings and ranks the results automatically.
```{code-block} python
---
caption: main.py
---
import asyncio
from pfzy import fuzzy_match

async def main():
    # fuzzy_match is a coroutine, so it has to be awaited.
    return await fuzzy_match("ab", ["acb", "acbabc"])

if __name__ == "__main__":
    print(asyncio.run(main()))
```
```{code-block} console
$ python main.py
[{'value': 'acbabc', 'indices': [3, 4]}, {'value': 'acb', 'indices': [0, 2]}]
```
### Matching against dictionaries
The second argument can also be a list of dictionaries. In that case, also pass the `key` argument so that
the function knows which dictionary key holds the string to match.
```{code-block} python
import asyncio
from pfzy import fuzzy_match
result = asyncio.run(fuzzy_match("ab", [{"val": "acb"}, {"val": "acbabc"}], key="val"))
```
```{code-block} python
>>> print(result)
[{'val': 'acbabc', 'indices': [3, 4]}, {'val': 'acb', 'indices': [0, 2]}]
```
### Using a different scorer
By default, `fuzzy_match` uses [fzy_scorer](#fzy_scorer) for string matching. To use a different scorer,
pass it through the `scorer` argument. See [Scorer](#scorer) for the list of
available scorers.
```{code-block} python
import asyncio
from pfzy import fuzzy_match, substr_scorer
result = asyncio.run(fuzzy_match("ab", ["acb", "acbabc"], scorer=substr_scorer))
```
```{code-block} python
>>> print(result)
[{'value': 'acbabc', 'indices': [3, 4]}]
```
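To make the roles of `key`, `scorer`, and the ranking step concrete, here is a minimal pure-Python sketch of the same flow. It is not pfzy's implementation: `toy_match` and `toy_substr_scorer` are hypothetical names, the sketch is synchronous, and the scoring rule (earlier matches score higher) is an assumption made only for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple, Union

ScoreIndices = Tuple[float, Optional[List[int]]]


def toy_substr_scorer(needle: str, haystack: str) -> ScoreIndices:
    """Hypothetical scorer: exact substring match, earlier matches score higher."""
    start = haystack.find(needle)
    if start == -1:
        # No match: signal it with indices set to None.
        return float("-inf"), None
    return -float(start), list(range(start, start + len(needle)))


def toy_match(
    needle: str,
    haystacks: List[Union[str, Dict[str, str]]],
    scorer: Callable[[str, str], ScoreIndices],
    key: str = "value",
) -> List[Dict[str, object]]:
    """Score every candidate, drop non-matches, and rank by score (best first)."""
    scored = []
    for item in haystacks:
        # Plain strings are wrapped in a dictionary so every result has the same shape.
        record = dict(item) if isinstance(item, dict) else {key: item}
        score, indices = scorer(needle, record[key])
        if indices is None:
            continue
        scored.append((score, {**record, "indices": indices}))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored]


print(toy_match("ab", ["acb", "acbabc"], toy_substr_scorer))
print(toy_match("ab", [{"val": "xab"}, {"val": "ab"}], toy_substr_scorer, key="val"))
```

With an exact-substring scorer, `"acb"` is dropped because it does not contain `"ab"` as a contiguous substring, which mirrors the result shown above.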
## Scorer
### [fzy_scorer](https://pfzy.readthedocs.io/en/latest/pages/api.html#pfzy.score.fzy_scorer)
```{Tip}
The higher the score, the higher the string similarity.
```
The `fzy_scorer` uses the [fzy](https://github.com/jhawthorn/fzy) matching logic to perform fuzzy string
matching.
The returned value is a tuple with the matching score and the matching indices.
```{code-block} python
from pfzy import fzy_scorer
score, indices = fzy_scorer("ab", "acbabc")
```
```{code-block} python
>>> print(score)
0.98
>>> print(indices)
[3, 4]
```
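Fuzzy matching of this kind accepts a candidate when the query characters appear in it in order, not necessarily next to each other. The sketch below illustrates only that precondition. `leftmost_subsequence` is a hypothetical helper, not part of pfzy; it ignores case handling and does no scoring.

```python
from typing import List, Optional


def leftmost_subsequence(needle: str, haystack: str) -> Optional[List[int]]:
    """Return the leftmost positions where needle occurs as a subsequence.

    Hypothetical helper for illustration; returns None when there is no match.
    """
    indices = []
    start = 0
    for char in needle:
        # Look for the next query character after the previous match.
        pos = haystack.find(char, start)
        if pos == -1:
            return None
        indices.append(pos)
        start = pos + 1
    return indices


print(leftmost_subsequence("ab", "acb"))
print(leftmost_subsequence("ab", "acbabc"))
print(leftmost_subsequence("ab", "ba"))
```

For `"acbabc"` this helper reports the leftmost alignment `[0, 2]`, while `fzy_scorer` reports `[3, 4]` in the example above: a scorer is free to prefer a different alignment, such as one with adjacent characters.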
### [substr_scorer](https://pfzy.readthedocs.io/en/latest/pages/api.html#pfzy.score.substr_scorer)
```{Note}
The score returned by `substr_scorer` might be a negative value, but that doesn't mean it's not a match.
As a rule of thumb, the higher the score, the higher the string similarity.
```
Use this scorer when exact substring matching is preferred. Unlike [fzy_scorer](#fzy_scorer),
`substr_scorer` only performs exact matching, and its score is calculated differently.
The returned value is a tuple with the matching score and the matching indices.
```{code-block} python
from pfzy import substr_scorer
score, indices = substr_scorer("ab", "awsab")
```
```{code-block} python
>>> print(score)
-1.3
>>> print(indices)
[3, 4]
```
```{code-block} python
from pfzy import substr_scorer
score, indices = substr_scorer("ab", "asdafswabc")
```
```{code-block} python
>>> print(score)
-1.6388888888888888
>>> print(indices)
[7, 8]
```
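The indices returned by a scorer are positions in the matched string, which makes them convenient for highlighting. The following sketch shows one possible use; `highlight` and its bracket markers are hypothetical and not provided by pfzy.

```python
from typing import List


def highlight(text: str, indices: List[int], open_mark: str = "[", close_mark: str = "]") -> str:
    """Wrap each run of matched characters in markers (hypothetical helper)."""
    marked = set(indices)
    out = []
    inside = False
    for pos, char in enumerate(text):
        if pos in marked and not inside:
            # A run of matched characters starts here.
            out.append(open_mark)
            inside = True
        elif pos not in marked and inside:
            # The run ended on the previous character.
            out.append(close_mark)
            inside = False
        out.append(char)
    if inside:
        out.append(close_mark)
    return "".join(out)


print(highlight("asdafswabc", [7, 8]))
print(highlight("acb", [0, 2]))
```

Applied to the indices from the examples above, consecutive positions such as `[7, 8]` produce a single highlighted run, while `[0, 2]` produces two separate runs.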