

# Distance/Similarity

PyCM's `distance` method provides a wide range of string distance/similarity metrics for evaluating a confusion matrix by measuring its distance to a perfect confusion matrix. Distance/similarity metrics measure the distance between two vectors of numbers; a small distance between two objects indicates similarity. The measure to use is chosen from `DistanceType`. The measures' names follow the naming style suggested in [[1]](#ref1).

In [1]:
from pycm import ConfusionMatrix, DistanceType

In [2]:
cm = ConfusionMatrix(matrix={0: {0: 3, 1: 0, 2: 0}, 1: {0: 0, 1: 1, 2: 2}, 2: {0: 2, 1: 1, 2: 3}})

$$TP \rightarrow \text{True Positive}$$
$$TN \rightarrow \text{True Negative}$$
$$FP \rightarrow \text{False Positive}$$
$$FN \rightarrow \text{False Negative}$$
$$POP \rightarrow \text{Population}$$
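As a rough sketch of where these quantities come from, the snippet below derives one-vs-rest counts for each class of the example matrix in plain Python. It assumes the outer keys are actual classes and the inner keys are predicted classes; the helper name `one_vs_rest_counts` is illustrative and is not part of PyCM.

```python
# Example matrix from the cell above, read as matrix[actual][predicted] (assumed layout).
matrix = {0: {0: 3, 1: 0, 2: 0}, 1: {0: 0, 1: 1, 2: 2}, 2: {0: 2, 1: 1, 2: 3}}


def one_vs_rest_counts(matrix):
    """Return {class: (TP, FP, FN, TN)} for a nested-dict confusion matrix."""
    classes = list(matrix)
    pop = sum(sum(row.values()) for row in matrix.values())
    counts = {}
    for c in classes:
        tp = matrix[c][c]
        fn = sum(matrix[c].values()) - tp              # actual c, predicted as another class
        fp = sum(matrix[a][c] for a in classes) - tp   # predicted c, actually another class
        tn = pop - tp - fp - fn                        # everything else
        counts[c] = (tp, fp, fn, tn)
    return counts


counts = one_vs_rest_counts(matrix)
POP = sum(counts[0])  # TP + FP + FN + TN is the same for every class
print(counts)  # {0: (3, 2, 0, 7), 1: (1, 1, 2, 8), 2: (3, 2, 3, 4)}
print(POP)     # 12
```

Every formula on this page is evaluated once per class on these four counts, which is why each result is a dictionary keyed by class.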

## AMPLE

AMPLE similarity [[2]](#ref2) [[3]](#ref3).

$$sim_{AMPLE}=|\frac{TP}{TP+FP}-\frac{FN}{FN+TN}|$$

In [3]:
cm.distance(metric=DistanceType.AMPLE)

{0: 0.6, 1: 0.3, 2: 0.17142857142857143}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>
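The AMPLE output above can be checked by hand. The sketch below hard-codes the per-class `(TP, FP, FN, TN)` counts of the example matrix (an assumption derived from reading the matrix as actual-by-predicted) and applies the formula directly; the function name `ample` is illustrative.

```python
# (TP, FP, FN, TN) per class for the example matrix (assumed one-vs-rest counts).
COUNTS = {0: (3, 2, 0, 7), 1: (1, 1, 2, 8), 2: (3, 2, 3, 4)}


def ample(tp, fp, fn, tn):
    """|TP / (TP + FP) - FN / (FN + TN)| as in the AMPLE formula above."""
    return abs(tp / (tp + fp) - fn / (fn + tn))


ample_scores = {c: ample(*v) for c, v in COUNTS.items()}
print(ample_scores)
```

For class 0 this is |3/5 - 0/7| = 0.6, matching the first entry of the output above.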

## Anderberg's D

Anderberg's D [[4]](#ref4).

$$sim_{Anderberg} =
\frac{(max(TP,FP)+max(FN,TN)+max(TP,FN)+max(FP,TN))-
(max(TP+FN,FP+TN)+max(TP+FP,FN+TN))}{2\times POP}$$

In [4]:
cm.distance(metric=DistanceType.Anderberg)

{0: 0.16666666666666666, 1: 0.0, 2: 0.041666666666666664}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Andres & Marzo's Delta

Andres & Marzo's Delta correlation [[5]](#ref5).

$$corr_{AndresMarzo_\Delta} = \Delta =
\frac{TP+TN-2 \times \sqrt{FP \times FN}}{POP}$$

In [5]:
cm.distance(metric=DistanceType.AndresMarzoDelta)

{0: 0.8333333333333334, 1: 0.5142977396044842, 2: 0.17508504286947035}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>
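A minimal sketch of the Delta formula in plain Python, useful for confirming the values above. The counts are hard-coded per class as `(TP, FP, FN, TN)` for the example matrix, and the name `andres_marzo_delta` is illustrative rather than a PyCM function.

```python
import math

# (TP, FP, FN, TN) per class for the example matrix (assumed one-vs-rest counts).
COUNTS = {0: (3, 2, 0, 7), 1: (1, 1, 2, 8), 2: (3, 2, 3, 4)}


def andres_marzo_delta(tp, fp, fn, tn):
    """(TP + TN - 2 * sqrt(FP * FN)) / POP."""
    pop = tp + fp + fn + tn
    return (tp + tn - 2 * math.sqrt(fp * fn)) / pop


delta_scores = {c: andres_marzo_delta(*v) for c, v in COUNTS.items()}
print(delta_scores)
```

Class 0 has no false negatives, so the square-root term vanishes and the value reduces to (TP + TN) / POP = 10/12.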

## Baroni-Urbani & Buser I

Baroni-Urbani & Buser I similarity [[6]](#ref6).

$$sim_{BaroniUrbaniBuserI} =
\frac{\sqrt{TP\times TN}+TP}{\sqrt{TP\times TN}+TP+FP+FN}$$

In [6]:
cm.distance(metric=DistanceType.BaroniUrbaniBuserI)

{0: 0.79128784747792, 1: 0.5606601717798213, 2: 0.5638559245324765}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baroni-Urbani & Buser II

Baroni-Urbani & Buser II correlation [[6]](#ref6).

$$corr_{BaroniUrbaniBuserII} =
\frac{\sqrt{TP \times TN}+TP-FP-FN}{\sqrt{TP \times TN}+TP+FP+FN}$$

In [7]:
cm.distance(metric=DistanceType.BaroniUrbaniBuserII)

{0: 0.58257569495584, 1: 0.12132034355964261, 2: 0.1277118490649528}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Batagelj & Bren

Batagelj & Bren distance [[7]](#ref7).

$$dist_{BatageljBren} =
\frac{FP \times FN}{TP \times TN}$$

In [8]:
cm.distance(metric=DistanceType.BatageljBren)

{0: 0.0, 1: 0.25, 2: 0.5}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu I

Baulieu I distance [[8]](#ref8).

$$dist_{BaulieuI} =
\frac{(TP+FP) \times (TP+FN)-TP^2}{(TP+FP) \times (TP+FN)}$$

In [9]:
cm.distance(metric=DistanceType.BaulieuI)

{0: 0.4, 1: 0.8333333333333334, 2: 0.7}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu II

Baulieu II similarity [[8]](#ref8).

$$sim_{BaulieuII} =
\frac{TP^2 \times TN^2}{(TP+FP) \times (TP+FN) \times (FP+TN) \times (FN+TN)}$$

In [10]:
cm.distance(metric=DistanceType.BaulieuII)

{0: 0.4666666666666667, 1: 0.11851851851851852, 2: 0.11428571428571428}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu III

Baulieu III distance [[8]](#ref8).

$$dist_{BaulieuIII} =
\frac{POP^2 - 4 \times (TP \times TN-FP \times FN)}{2 \times POP^2}$$

In [11]:
cm.distance(metric=DistanceType.BaulieuIII)

{0: 0.20833333333333334, 1: 0.4166666666666667, 2: 0.4166666666666667}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu IV

Baulieu IV distance [[9]](#ref9).

$$dist_{BaulieuIV} = \frac{FP+FN-(TP+\frac{1}{2})\times(TN+\frac{1}{2})\times TN  \times k}{POP}$$

In [12]:
cm.distance(metric=DistanceType.BaulieuIV)

{0: -41.45702383161246, 1: -22.855395541901885, 2: -13.85431293274332}

* The default value of k is Euler's number $e$

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>
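To see how the constant $k$ enters the Baulieu IV result, here is a hedged sketch that evaluates the formula with `k` defaulting to Euler's number. The `k` argument is a plain function parameter introduced for illustration; this sketch does not show how PyCM itself exposes that constant.

```python
import math

# (TP, FP, FN, TN) per class for the example matrix (assumed one-vs-rest counts).
COUNTS = {0: (3, 2, 0, 7), 1: (1, 1, 2, 8), 2: (3, 2, 3, 4)}


def baulieu_iv(tp, fp, fn, tn, k=math.e):
    """(FP + FN - (TP + 1/2) * (TN + 1/2) * TN * k) / POP."""
    pop = tp + fp + fn + tn
    return (fp + fn - (tp + 0.5) * (tn + 0.5) * tn * k) / pop


baulieu_iv_scores = {c: baulieu_iv(*v) for c, v in COUNTS.items()}
print(baulieu_iv_scores)
```

Because the product term grows with both TP and TN, the measure is strongly negative whenever a class has many correct decisions, which explains the large negative values above.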

## Baulieu V

Baulieu V distance [[9]](#ref9).

$$dist_{BaulieuV} = \frac{FP+FN+1}{TP+FP+FN+1}$$

In [13]:
cm.distance(metric=DistanceType.BaulieuV)

{0: 0.5, 1: 0.8, 2: 0.6666666666666666}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu VI

Baulieu VI distance [[9]](#ref9).

$$dist_{BaulieuVI} = \frac{FP+FN}{TP+FP+FN+1}$$

In [14]:
cm.distance(metric=DistanceType.BaulieuVI)

{0: 0.3333333333333333, 1: 0.6, 2: 0.5555555555555556}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu VII

Baulieu VII distance [[9]](#ref9).

$$dist_{BaulieuVII} = \frac{FP+FN}{POP + TP \times (TP-4)^2}$$

In [15]:
cm.distance(metric=DistanceType.BaulieuVII)

{0: 0.13333333333333333, 1: 0.14285714285714285, 2: 0.3333333333333333}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu VIII

Baulieu VIII distance [[9]](#ref9).

$$dist_{BaulieuVIII} = \frac{(FP-FN)^2}{POP^2}$$

In [16]:
cm.distance(metric=DistanceType.BaulieuVIII)

{0: 0.027777777777777776, 1: 0.006944444444444444, 2: 0.006944444444444444}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu IX

Baulieu IX distance [[9]](#ref9).

$$dist_{BaulieuIX} = \frac{FP+2 \times FN}{TP+FP+2 \times FN+TN}$$

In [17]:
cm.distance(metric=DistanceType.BaulieuIX)

{0: 0.16666666666666666, 1: 0.35714285714285715, 2: 0.5333333333333333}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu X

Baulieu X distance [[9]](#ref9).

$$dist_{BaulieuX} = \frac{FP+FN+max(FP,FN)}{POP+max(FP,FN)}$$

In [18]:
cm.distance(metric=DistanceType.BaulieuX)

{0: 0.2857142857142857, 1: 0.35714285714285715, 2: 0.5333333333333333}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu XI

Baulieu XI distance [[9]](#ref9).

$$dist_{BaulieuXI} = \frac{FP+FN}{FP+FN+TN}$$

In [19]:
cm.distance(metric=DistanceType.BaulieuXI)

{0: 0.2222222222222222, 1: 0.2727272727272727, 2: 0.5555555555555556}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu XII

Baulieu XII distance [[9]](#ref9).

$$dist_{BaulieuXII} = \frac{FP+FN}{TP+FP+FN-1}$$

In [20]:
cm.distance(metric=DistanceType.BaulieuXII)

{0: 0.5, 1: 1.0, 2: 0.7142857142857143}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu XIII

Baulieu XIII distance [[9]](#ref9).

$$dist_{BaulieuXIII} = \frac{FP+FN}{TP+FP+FN+TP \times (TP-4)^2}$$

In [21]:
cm.distance(metric=DistanceType.BaulieuXIII)

{0: 0.25, 1: 0.23076923076923078, 2: 0.45454545454545453}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu XIV

Baulieu XIV distance [[9]](#ref9).

$$dist_{BaulieuXIV} = \frac{FP+2 \times FN}{TP+FP+2 \times FN}$$

In [22]:
cm.distance(metric=DistanceType.BaulieuXIV)

{0: 0.4, 1: 0.8333333333333334, 2: 0.7272727272727273}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Baulieu XV

Baulieu XV distance [[9]](#ref9).

$$dist_{BaulieuXV} = \frac{FP+FN+max(FP, FN)}{TP+FP+FN+max(FP, FN)}$$

In [23]:
cm.distance(metric=DistanceType.BaulieuXV)

{0: 0.5714285714285714, 1: 0.8333333333333334, 2: 0.7272727272727273}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Benini I

Benini I correlation [[10]](#ref10).

$$corr_{BeniniI} = \frac{TP \times TN-FP \times FN}{(TP+FN)\times(FN+TN)}$$

In [24]:
cm.distance(metric=DistanceType.BeniniI)

{0: 1.0, 1: 0.2, 2: 0.14285714285714285}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Benini II

Benini II correlation [[10]](#ref10).

$$corr_{BeniniII} = \frac{TP \times TN-FP \times FN}{min((TP+FN)\times(FN+TN), (TP+FP)\times(FP+TN))}$$

In [25]:
cm.distance(metric=DistanceType.BeniniII)

{0: 1.0, 1: 0.3333333333333333, 2: 0.2}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>
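The two Benini correlations share a numerator and differ only in the denominator. The sketch below, written against hard-coded example counts with illustrative function names, makes that relationship explicit.

```python
# (TP, FP, FN, TN) per class for the example matrix (assumed one-vs-rest counts).
COUNTS = {0: (3, 2, 0, 7), 1: (1, 1, 2, 8), 2: (3, 2, 3, 4)}


def benini_i(tp, fp, fn, tn):
    """(TP*TN - FP*FN) / ((TP + FN) * (FN + TN))."""
    return (tp * tn - fp * fn) / ((tp + fn) * (fn + tn))


def benini_ii(tp, fp, fn, tn):
    """Same numerator, divided by the smaller of the two marginal products."""
    denominator = min((tp + fn) * (fn + tn), (tp + fp) * (fp + tn))
    return (tp * tn - fp * fn) / denominator


benini_i_scores = {c: benini_i(*v) for c, v in COUNTS.items()}
benini_ii_scores = {c: benini_ii(*v) for c, v in COUNTS.items()}
print(benini_i_scores)
print(benini_ii_scores)
```

When the numerator is positive, Benini II can never be smaller than Benini I, since its denominator is the minimum of two products that include the Benini I denominator.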

## Canberra

Canberra distance [[11]](#ref11) [[12]](#ref12).

$$dist_{Canberra} =
\frac{FP+FN}{(TP+FP)+(TP+FN)}$$

In [26]:
cm.distance(metric=DistanceType.Canberra)

{0: 0.25, 1: 0.6, 2: 0.45454545454545453}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Clement

Clement similarity [[13]](#ref13).

$$sim_{Clement} =
\frac{TP}{TP+FP}\times\Big(1 - \frac{TP+FP}{POP}\Big) +
\frac{TN}{FN+TN}\times\Big(1 - \frac{FN+TN}{POP}\Big)$$

In [27]:
cm.distance(metric=DistanceType.Clement)

{0: 0.7666666666666666, 1: 0.55, 2: 0.588095238095238}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Consonni & Todeschini I

Consonni & Todeschini I similarity [[14]](#ref14).

$$sim_{ConsonniTodeschiniI} =
\frac{log(1+TP+TN)}{log(1+POP)}$$

In [28]:
cm.distance(metric=DistanceType.ConsonniTodeschiniI)

{0: 0.9348704159880586, 1: 0.8977117175026231, 2: 0.8107144632819592}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Consonni & Todeschini II

Consonni & Todeschini II similarity [[14]](#ref14).

$$sim_{ConsonniTodeschiniII} =
\frac{log(1+POP)-log(1+FP+FN)}{log(1+POP)}$$

In [29]:
cm.distance(metric=DistanceType.ConsonniTodeschiniII)

{0: 0.5716826589686053, 1: 0.4595236911453605, 2: 0.3014445045412856}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Consonni & Todeschini III

Consonni & Todeschini III similarity [[14]](#ref14).

$$sim_{ConsonniTodeschiniIII} =
\frac{log(1+TP)}{log(1+POP)}$$

In [30]:
cm.distance(metric=DistanceType.ConsonniTodeschiniIII)

{0: 0.5404763088546395, 1: 0.27023815442731974, 2: 0.5404763088546395}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Consonni & Todeschini IV

Consonni & Todeschini IV similarity [[14]](#ref14).

$$sim_{ConsonniTodeschiniIV} =
\frac{log(1+TP)}{log(1+TP+FP+FN)}$$

In [31]:
cm.distance(metric=DistanceType.ConsonniTodeschiniIV)

{0: 0.7737056144690831, 1: 0.43067655807339306, 2: 0.6309297535714574}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>

## Consonni & Todeschini V

Consonni & Todeschini V correlation [[14]](#ref14).

$$corr_{ConsonniTodeschiniV} =
\frac{log(1+TP \times TN)-log(1+FP \times FN)}{log(1+\frac{POP^2}{4})}$$

In [32]:
cm.distance(metric=DistanceType.ConsonniTodeschiniV)

{0: 0.8560267854703983, 1: 0.30424737289682985, 2: 0.17143541431350617}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.8</span> </li>
</ul>
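All five Consonni & Todeschini measures are ratios of logarithms, so the choice of logarithm base cancels out. The following sketch uses the natural logarithm and hard-coded example counts; the function name `consonni_todeschini` and the dictionary keys are illustrative.

```python
import math

# (TP, FP, FN, TN) per class for the example matrix (assumed one-vs-rest counts).
COUNTS = {0: (3, 2, 0, 7), 1: (1, 1, 2, 8), 2: (3, 2, 3, 4)}


def consonni_todeschini(tp, fp, fn, tn):
    """Return the five Consonni & Todeschini measures for one class."""
    pop = tp + fp + fn + tn
    log = math.log
    return {
        "I": log(1 + tp + tn) / log(1 + pop),
        "II": (log(1 + pop) - log(1 + fp + fn)) / log(1 + pop),
        "III": log(1 + tp) / log(1 + pop),
        "IV": log(1 + tp) / log(1 + tp + fp + fn),
        "V": (log(1 + tp * tn) - log(1 + fp * fn)) / log(1 + pop ** 2 / 4),
    }


ct_scores = {c: consonni_todeschini(*v) for c, v in COUNTS.items()}
for label, values in ct_scores.items():
    print(label, values)
```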

## Dennis

Dennis similarity [[15]](#ref15).

$$sim_{Dennis} =
\frac{TP-\frac{(TP+FP)\times(TP+FN)}{POP}}{\sqrt{\frac{(TP+FP)\times(TP+FN)}{POP}}}$$

In [33]:
cm.distance(metric=DistanceType.Dennis)

{0: 1.5652475842498528, 1: 0.7071067811865475, 2: 0.31622776601683794}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Digby

Digby correlation [[16]](#ref16).

$$corr_{Digby} =
\frac{(TP \times TN) ^\frac{3}{4}-(FP \times FN)^\frac{3}{4}}{(TP \times TN)^\frac{3}{4}+(FP \times FN)^\frac{3}{4}}$$

In [34]:
cm.distance(metric=DistanceType.Digby)

{0: 1.0, 1: 0.47759225007251715, 2: 0.2542302383508219}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Dispersion

Dispersion correlation [[17]](#ref17).

$$corr_{dispersion} =
\frac{TP \times TN -FP \times FN}{POP^2}
$$

In [35]:
cm.distance(metric=DistanceType.Dispersion)

{0: 0.14583333333333334, 1: 0.041666666666666664, 2: 0.041666666666666664}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Doolittle

Doolittle similarity [[18]](#ref18).

$$sim_{Doolittle} =
\frac{(TP\times POP - (TP+FP)\times(TP+FN))^2}{(TP+FP)\times(TP+FN)\times(FP+TN)\times(FN+TN)}$$

In [36]:
cm.distance(metric=DistanceType.Doolittle)

{0: 0.4666666666666667, 1: 0.06666666666666667, 2: 0.02857142857142857}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Eyraud

Eyraud similarity [[19]](#ref19).

$$sim_{Eyraud} =
\frac{TP-(TP+FP)\times(TP+FN)}{(TP+FP)\times(TP+FN)\times(FP+TN)\times(FN+TN)}$$

In [37]:
cm.distance(metric=DistanceType.Eyraud)

{0: -0.012698412698412698, 1: -0.009259259259259259, 2: -0.02142857142857143}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Fager & McGowan

Fager & McGowan similarity [[20]](#ref20) [[21]](#ref21).

$$sim_{FagerMcGowan} =
\frac{TP}{\sqrt{(TP+FP)\times(TP+FN)}} - \frac{1}{2\sqrt{max(TP+FP, TP+FN)}}$$

In [38]:
cm.distance(metric=DistanceType.FagerMcGowan)

{0: 0.5509898714915045, 1: 0.11957315586905015, 2: 0.3435984122732345}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Faith

Faith similarity [[22]](#ref22).

$$sim_{Faith} =
\frac{TP+\frac{TN}{2}}{POP}$$

In [39]:
cm.distance(metric=DistanceType.Faith)

{0: 0.5416666666666666, 1: 0.4166666666666667, 2: 0.4166666666666667}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Fleiss-Levin-Paik

Fleiss-Levin-Paik similarity [[23]](#ref23).

$$sim_{FleissLevinPaik} =
\frac{2 \times TN}{2 \times TN + FP + FN}$$

In [40]:
cm.distance(metric=DistanceType.FleissLevinPaik)

{0: 0.875, 1: 0.8421052631578947, 2: 0.6153846153846154}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Forbes I

Forbes I similarity [[24]](#ref24) [[25]](#ref25).

$$sim_{ForbesI} =
\frac{POP \times TP}{(TP+FP)\times(TP+FN)}$$

In [41]:
cm.distance(metric=DistanceType.ForbesI)

{0: 2.4, 1: 2.0, 2: 1.2}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Forbes II

Forbes II correlation [[26]](#ref26).

$$corr_{ForbesII} =
\frac{FP \times FN-TP \times TN}{(TP+FP)\times(TP+FN) - POP \times min(TP+FP, TP+FN)}$$

In [42]:
cm.distance(metric=DistanceType.ForbesII)

{0: 1.0, 1: 0.3333333333333333, 2: 0.2}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Fossum

Fossum similarity [[27]](#ref27).

$$sim_{Fossum} =
\frac{POP \times (TP-\frac{1}{2})^2}{(TP+FP)\times(TP+FN)}$$

In [43]:
cm.distance(metric=DistanceType.Fossum)

{0: 5.0, 1: 0.5, 2: 2.5}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Gilbert & Wells

Gilbert & Wells similarity [[28]](#ref28).

$$sim_{GilbertWells} =
ln \frac{POP^3}{2\pi (TP+FP)\times(TP+FN)\times(FP+TN)\times(FN+TN)} +
2ln \frac{POP! \times TP! \times FP! \times FN! \times TN!}{(TP+FP)! \times (TP+FN)! \times (FP+TN)! \times (FN+TN)!}$$

In [44]:
cm.distance(metric=DistanceType.GilbertWells)

{0: 4.947742862177545, 1: 1.1129094954405283, 2: 0.4195337173255813}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>
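The factorial term in the Gilbert & Wells formula is easy to get wrong by hand, so a small sketch may help. It evaluates the two logarithmic terms separately using exact integer factorials from the standard library (`math.prod` requires Python 3.8 or later); the function name is illustrative and the counts are hard-coded from the example matrix.

```python
import math

# (TP, FP, FN, TN) per class for the example matrix (assumed one-vs-rest counts).
COUNTS = {0: (3, 2, 0, 7), 1: (1, 1, 2, 8), 2: (3, 2, 3, 4)}


def gilbert_wells(tp, fp, fn, tn):
    """ln(POP^3 / (2*pi*product of marginals)) + 2*ln(factorial ratio)."""
    pop = tp + fp + fn + tn
    marginals = (tp + fp, tp + fn, fp + tn, fn + tn)
    first = math.log(pop ** 3 / (2 * math.pi * math.prod(marginals)))
    # Keep the factorial products as exact integers and take logs at the end.
    numerator = math.factorial(pop) * math.prod(math.factorial(x) for x in (tp, fp, fn, tn))
    denominator = math.prod(math.factorial(m) for m in marginals)
    second = 2 * (math.log(numerator) - math.log(denominator))
    return first + second


gw_scores = {c: gilbert_wells(*v) for c, v in COUNTS.items()}
print(gw_scores)
```

For much larger populations, replacing the factorials with `math.lgamma(n + 1)` would avoid building very large integers, at the cost of a small floating-point error.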

## Goodall

Goodall similarity [[29]](#ref29) [[30]](#ref30).

$$sim_{Goodall} =\frac{2}{\pi} \sin^{-1}\Big(
\sqrt{\frac{TP + TN}{POP}}
\Big)$$

In [45]:
cm.distance(metric=DistanceType.Goodall)

{0: 0.7322795271987701, 1: 0.6666666666666666, 2: 0.5533003790381138}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>
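Goodall similarity is an arcsine transform of accuracy, rescaled so that the result lies in [0, 1]. A minimal sketch under the same assumed per-class counts, with an illustrative function name:

```python
import math

# (TP, FP, FN, TN) per class for the example matrix (assumed one-vs-rest counts).
COUNTS = {0: (3, 2, 0, 7), 1: (1, 1, 2, 8), 2: (3, 2, 3, 4)}


def goodall(tp, fp, fn, tn):
    """(2 / pi) * asin(sqrt((TP + TN) / POP))."""
    pop = tp + fp + fn + tn
    return 2 / math.pi * math.asin(math.sqrt((tp + tn) / pop))


goodall_scores = {c: goodall(*v) for c, v in COUNTS.items()}
print(goodall_scores)
```

Class 1 gives a tidy check: (TP + TN) / POP = 9/12, whose square root is sin(π/3), so the result is exactly 2/3 up to rounding.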

## Goodman & Kruskal's Lambda

Goodman & Kruskal's Lambda similarity [[31]](#ref31).

$$sim_{GK_\lambda} =
\frac{\frac{1}{2}((max(TP,FP)+max(FN,TN)+max(TP,FN)+max(FP,TN))-
(max(TP+FP,FN+TN)+max(TP+FN,FP+TN)))}
{POP-\frac{1}{2}(max(TP+FP,FN+TN)+max(TP+FN,FP+TN))}$$

In [46]:
cm.distance(metric=DistanceType.GoodmanKruskalLambda)

{0: 0.5, 1: 0.0, 2: 0.09090909090909091}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Goodman & Kruskal Lambda-r

Goodman & Kruskal Lambda-r correlation [[31]](#ref31).

$$corr_{GK_{\lambda_r}} =
\frac{TP + TN - \frac{1}{2}(max(TP+FP,FN+TN)+max(TP+FN,FP+TN))}
{POP - \frac{1}{2}(max(TP+FP,FN+TN)+max(TP+FN,FP+TN))}
$$

In [47]:
cm.distance(metric=DistanceType.GoodmanKruskalLambdaR)

{0: 0.5, 1: -0.2, 2: 0.09090909090909091}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Guttman's Lambda A

Guttman's Lambda A similarity [[32]](#ref32).

$$sim_{Guttman_{\lambda_a}} =
\frac{max(TP, FN) + max(FP, TN) - max(TP+FP, FN+TN)}{POP - max(TP+FP, FN+TN)}
$$

In [48]:
cm.distance(metric=DistanceType.GuttmanLambdaA)

{0: 0.6, 1: 0.0, 2: 0.0}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Guttman's Lambda B

Guttman's Lambda B similarity [[32]](#ref32).

$$sim_{Guttman_{\lambda_b}} =
\frac{max(TP, FP) + max(FN, TN) - max(TP+FN, FP+TN)}{POP - max(TP+FN, FP+TN)}
$$

In [49]:
cm.distance(metric=DistanceType.GuttmanLambdaB)

{0: 0.3333333333333333, 1: 0.0, 2: 0.16666666666666666}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>
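Guttman's Lambda A and Lambda B are mirror images of each other: swapping FP and FN in one yields the other. The sketch below implements Lambda A from its formula and obtains Lambda B by that swap, which is an observation about the two formulas above rather than a statement about PyCM internals.

```python
# (TP, FP, FN, TN) per class for the example matrix (assumed one-vs-rest counts).
COUNTS = {0: (3, 2, 0, 7), 1: (1, 1, 2, 8), 2: (3, 2, 3, 4)}


def guttman_lambda_a(tp, fp, fn, tn):
    """(max(TP,FN) + max(FP,TN) - max(TP+FP, FN+TN)) / (POP - max(TP+FP, FN+TN))."""
    pop = tp + fp + fn + tn
    largest = max(tp + fp, fn + tn)
    return (max(tp, fn) + max(fp, tn) - largest) / (pop - largest)


def guttman_lambda_b(tp, fp, fn, tn):
    """Lambda B is Lambda A with the roles of FP and FN exchanged."""
    return guttman_lambda_a(tp, fn, fp, tn)


lambda_a_scores = {c: guttman_lambda_a(*v) for c, v in COUNTS.items()}
lambda_b_scores = {c: guttman_lambda_b(*v) for c, v in COUNTS.items()}
print(lambda_a_scores)
print(lambda_b_scores)
```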

## Hamann

Hamann correlation [[33]](#ref33).

$$corr_{Hamann} =
\frac{TP+TN-FP-FN}{POP}
$$

In [50]:
cm.distance(metric=DistanceType.Hamann)

{0: 0.6666666666666666, 1: 0.5, 2: 0.16666666666666666}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Harris & Lahey

Harris & Lahey similarity [[34]](#ref34).

$$sim_{HarrisLahey} =
\frac{TP}{TP+FP+FN} \times \frac{2TN+FP+FN}{2POP}+
\frac{TN}{TN+FP+FN} \times \frac{2TP+FP+FN}{2POP}
$$

In [51]:
cm.distance(metric=DistanceType.HarrisLahey)

{0: 0.6592592592592592, 1: 0.3494318181818182, 2: 0.4068287037037037}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Hawkins & Dotson

Hawkins & Dotson similarity [[35]](#ref35).

$$sim_{HawkinsDotson} =
\frac{1}{2} \times \Big(\frac{TP}{TP+FP+FN}+\frac{TN}{FP+FN+TN}\Big)
$$

In [52]:
cm.distance(metric=DistanceType.HawkinsDotson)

{0: 0.6888888888888889, 1: 0.48863636363636365, 2: 0.4097222222222222}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Kendall's Tau

Kendall's Tau correlation [[36]](#ref36).

$$corr_{KendallTau} =
\frac{2 \times (TP+TN-FP-FN)}{POP \times (POP-1)}
$$

In [53]:
cm.distance(metric=DistanceType.KendallTau)

{0: 0.12121212121212122, 1: 0.09090909090909091, 2: 0.030303030303030304}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Kent & Foster I

Kent & Foster I similarity [[37]](#ref37).

$$sim_{KentFosterI} =
\frac{TP-\frac{(TP+FP)\times(TP+FN)}{TP+FP+FN}}{TP-\frac{(TP+FP)\times(TP+FN)}{TP+FP+FN}+FP+FN}
$$

In [54]:
cm.distance(metric=DistanceType.KentFosterI)

{0: 0.0, 1: -0.2, 2: -0.17647058823529413}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>

## Kent & Foster II

Kent & Foster II similarity [[37]](#ref37).

$$sim_{KentFosterII} =
\frac{TN-\frac{(FP+TN)\times(FN+TN)}{FP+FN+TN}}{TN-\frac{(FP+TN)\times(FN+TN)}{FP+FN+TN}+FP+FN}
$$

In [55]:
cm.distance(metric=DistanceType.KentFosterII)

{0: 0.0, 1: -0.06451612903225801, 2: -0.15384615384615394}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 3.9</span> </li>
</ul>
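As a check on the Kent & Foster II output, the sketch below computes the chance-corrected non-occurrence term once and reuses it in the numerator and the denominator. It uses $(FP+TN)\times(FN+TN)$ in both places, which is the form that reproduces the values printed above; the function name and hard-coded counts are illustrative.

```python
# (TP, FP, FN, TN) per class for the example matrix (assumed one-vs-rest counts).
COUNTS = {0: (3, 2, 0, 7), 1: (1, 1, 2, 8), 2: (3, 2, 3, 4)}


def kent_foster_ii(tp, fp, fn, tn):
    """(TN - chance) / (TN - chance + FP + FN), chance = (FP+TN)(FN+TN)/(FP+FN+TN)."""
    chance = (fp + tn) * (fn + tn) / (fp + fn + tn)
    corrected = tn - chance
    return corrected / (corrected + fp + fn)


kf2_scores = {c: kent_foster_ii(*v) for c, v in COUNTS.items()}
print(kf2_scores)
```

TP does not appear in the formula at all; the argument is kept only so the function takes the same four counts as the other measures.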

## Köppen I

Köppen I correlation [[38]](#ref38).

$$corr_{KoppenI} =
\frac{\frac{2 \times TP+FP+FN}{2} \times \frac{2 \times TN+FP+FN}{2} - \frac{FP+FN}{2}}
{\frac{2 \times TP+FP+FN}{2} \times \frac{2 \times TN+FP+FN}{2}}
$$

In [56]:
cm.distance(metric=DistanceType.KoppenI)

{0: 0.96875, 1: 0.9368421052631579, 2: 0.9300699300699301}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 4.1</span> </li>
</ul>

## Köppen II

Köppen II similarity [[38]](#ref38).

$$sim_{KoppenII} =
TP + \frac{FP + FN}{2}
$$

In [57]:
cm.distance(metric=DistanceType.KoppenII)

{0: 4.0, 1: 2.5, 2: 5.5}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 4.1</span> </li>
</ul>

## Kuder & Richardson

Kuder & Richardson correlation [[39]](#ref39).

$$corr_{KuderRichardson} =
\frac{4 \times (TP \times TN - FP \times FN)}
{(TP+FP)(FN+TN) + (TP+FN)(FP+TN) + 2(TP \times TN - FP \times FN)}
$$

In [58]:
cm.distance(metric=DistanceType.KuderRichardson)

{0: 0.8076923076923077, 1: 0.4067796610169492, 2: 0.2891566265060241}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 4.1</span> </li>
</ul>

## Kuhns I

Kuhns I correlation [[40]](#ref40).

$$corr_{KuhnsI} =
\frac{2 \times \delta(TP + FP, TP + FN)}
{POP}
$$

$$
\delta(TP + FP, TP + FN) = TP - \frac{(TP + FP) \times (TP + FN)}{POP}
$$

In [59]:
cm.distance(metric=DistanceType.KuhnsI)

{0: 0.2916666666666667, 1: 0.08333333333333333, 2: 0.08333333333333333}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 4.1</span> </li>
</ul>

## Kuhns II

Kuhns II correlation [[40]](#ref40).

$$corr_{KuhnsII} =
\frac{\delta(TP + FP, TP + FN)}
{\max(TP + FP, TP + FN)}
$$

$$
\delta(TP + FP, TP + FN) = TP - \frac{(TP + FP) \times (TP + FN)}{POP}
$$

In [60]:
cm.distance(metric=DistanceType.KuhnsII)

{0: 0.35, 1: 0.16666666666666666, 2: 0.08333333333333333}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 4.1</span> </li>
</ul>

## Kuhns III

Kuhns III correlation [[40]](#ref40).

$$corr_{KuhnsIII} =
\frac{\delta(TP + FP, TP + FN)}
{(1-\frac{TP}{2 \times TP + FP + FN})(2 \times TP + FP + FN-\frac{(TP + FP)(TP + FN)}{POP})}
$$

$$
\delta(TP + FP, TP + FN) = TP - \frac{(TP + FP) \times (TP + FN)}{POP}
$$

In [61]:
cm.distance(metric=DistanceType.KuhnsIII)

{0: 0.4148148148148148, 1: 0.1388888888888889, 2: 0.08088235294117647}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 4.2</span> </li>
</ul>

## Kuhns IV

Kuhns IV correlation [[40]](#ref40).

$$corr_{KuhnsIV} =
\frac{\delta(TP + FP, TP + FN)}
{\min(TP + FP, TP + FN)}
$$

$$
\delta(TP + FP, TP + FN) = TP - \frac{(TP + FP) \times (TP + FN)}{POP}
$$

In [62]:
cm.distance(metric=DistanceType.KuhnsIV)

{0: 0.5833333333333334, 1: 0.25, 2: 0.1}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 4.2</span> </li>
</ul>

## Kuhns V

Kuhns V correlation [[40]](#ref40).

$$corr_{KuhnsV} =
\frac{\delta(TP + FP, TP + FN)}
{\max((TP+FP)(1-\frac{TP+FP}{POP}), (TP+FN)(1-\frac{TP+FN}{POP}))}
$$

$$
\delta(TP + FP, TP + FN) = TP - \frac{(TP + FP) \times (TP + FN)}{POP}
$$

In [63]:
cm.distance(metric=DistanceType.KuhnsV)

{0: 0.6000000000000001, 1: 0.2222222222222222, 2: 0.16666666666666666}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 4.2</span> </li>
</ul>

## Kuhns VI

Kuhns VI correlation [[40]](#ref40).

$$corr_{KuhnsVI} =
\frac{\delta(TP + FP, TP + FN)}
{\min((TP+FP)(1-\frac{TP+FP}{POP}), (TP+FN)(1-\frac{TP+FN}{POP}))}
$$

$$
\delta(TP + FP, TP + FN) = TP - \frac{(TP + FP) \times (TP + FN)}{POP}
$$

In [64]:
cm.distance(metric=DistanceType.KuhnsVI)

{0: 0.7777777777777778, 1: 0.3, 2: 0.17142857142857146}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 4.2</span> </li>
</ul>

## Kuhns VII

Kuhns VII correlation [[40]](#ref40).

$$corr_{KuhnsVII} =
\frac{\delta(TP + FP, TP + FN)}
{\sqrt{(TP + FP) \times (TP + FN)}}
$$

$$
\delta(TP + FP, TP + FN) = TP - \frac{(TP + FP) \times (TP + FN)}{POP}
$$

In [65]:
cm.distance(metric=DistanceType.KuhnsVII)

{0: 0.45184805705753195, 1: 0.20412414523193154, 2: 0.09128709291752768}

<ul>
    <li><span style="color:red;">Notice </span> :  new in <span style="color:red;">version 4.2</span> </li>
</ul>
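All seven Kuhns correlations divide the same quantity $\delta$ by a different normaliser. The sketch below computes $\delta$ once per class, with the population size (12 for the example matrix) in its denominator, and then applies each normaliser. The function name, dictionary keys, and hard-coded counts are illustrative assumptions, not PyCM API.

```python
import math

# (TP, FP, FN, TN) per class for the example matrix (assumed one-vs-rest counts).
COUNTS = {0: (3, 2, 0, 7), 1: (1, 1, 2, 8), 2: (3, 2, 3, 4)}


def kuhns_all(tp, fp, fn, tn):
    """Return Kuhns I-VII for one class, sharing a single delta term."""
    pop = tp + fp + fn + tn
    pred, act = tp + fp, tp + fn            # the two marginals TP+FP and TP+FN
    delta = tp - pred * act / pop           # observed TP minus its expected value
    term_pred = pred * (1 - pred / pop)
    term_act = act * (1 - act / pop)
    both = pred + act                       # equals 2*TP + FP + FN
    return {
        "I": 2 * delta / pop,
        "II": delta / max(pred, act),
        "III": delta / ((1 - tp / both) * (both - pred * act / pop)),
        "IV": delta / min(pred, act),
        "V": delta / max(term_pred, term_act),
        "VI": delta / min(term_pred, term_act),
        "VII": delta / math.sqrt(pred * act),
    }


kuhns_scores = {c: kuhns_all(*v) for c, v in COUNTS.items()}
for label, values in kuhns_scores.items():
    print(label, values)
```

Since $\delta$ is shared, all seven measures have the same sign for a given class, and they differ only in scale.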

## References

<blockquote id="ref1">1- C. C. Little, "Abydos Documentation," 2018.</blockquote>

<blockquote id="ref2">2- V. Dallmeier, C. Lindig, and A. Zeller, "Lightweight defect localization for Java," in <i>European conference on object-oriented programming</i>, 2005: Springer, pp. 528-550.</blockquote>

<blockquote id="ref3">3- R. Abreu, P. Zoeteweij, and A. J. Van Gemund, "An evaluation of similarity coefficients for software fault localization," in 2006 <i>12th Pacific Rim International Symposium on Dependable Computing (PRDC'06)</i>, 2006: IEEE, pp. 39-46.</blockquote>

<blockquote id="ref4">4- M. R. Anderberg, <i>Cluster analysis for applications: probability and mathematical statistics: a series of monographs and textbooks</i>. Academic press, 2014.</blockquote>

<blockquote id="ref5">5- A. M. Andrés and P. F. Marzo, "Delta: A new measure of agreement between two raters," <i>British journal of mathematical and statistical psychology</i>, vol. 57, no. 1, pp. 1-19, 2004.</blockquote>

<blockquote id="ref6">6- C. Baroni-Urbani and M. W. Buser, "Similarity of binary data," <i>Systematic Zoology</i>, vol. 25, no. 3, pp. 251-259, 1976.</blockquote>

<blockquote id="ref7">7- V. Batagelj and M. Bren, "Comparing resemblance measures," <i>Journal of classification</i>, vol. 12, no. 1, pp. 73-90, 1995.</blockquote>

<blockquote id="ref8">8- F. B. Baulieu, "A classification of presence/absence based dissimilarity coefficients," <i>Journal of Classification</i>, vol. 6, no. 1, pp. 233-246, 1989.</blockquote>

<blockquote id="ref9">9- F. B. Baulieu, "Two variant axiom systems for presence/absence based dissimilarity coefficients," <i>Journal of Classification</i>, vol. 14, no. 1, pp. 159-170, 1997.</blockquote>

<blockquote id="ref10">10- R. Benini, <i>Principii di demografia</i>. Barbera, 1901.</blockquote>

<blockquote id="ref11">11- G. N. Lance and W. T. Williams, "Computer programs for hierarchical polythetic classification (“similarity analyses”)," <i>The Computer Journal</i>, vol. 9, no. 1, pp. 60-64, 1966.</blockquote>

<blockquote id="ref12">12- G. N. Lance and W. T. Williams, "Mixed-Data Classificatory Programs I - Agglomerative Systems," <i>Australian Computer Journal</i>, vol. 1, no. 1, pp. 15-20, 1967.</blockquote>

<blockquote id="ref13">13- P. W. Clement, "A formula for computing inter-observer agreement," <i>Psychological Reports</i>, vol. 39, no. 1, pp. 257-258, 1976.</blockquote>

<blockquote id="ref14">14- V. Consonni and R. Todeschini, "New similarity coefficients for binary data," <i>Match-Communications in Mathematical and Computer Chemistry</i>, vol. 68, no. 2, p. 581, 2012.</blockquote>

<blockquote id="ref15">15- S. F. Dennis, "The Construction of a Thesaurus Automatically From," in <i>Statistical Association Methods for Mechanized Documentation: Symposium Proceedings</i>, 1965, vol. 269: US Government Printing Office, p. 61.</blockquote>

<blockquote id="ref16">16- P. G. Digby, "Approximating the tetrachoric correlation coefficient," <i>Biometrics</i>, pp. 753-757, 1983.</blockquote>

<blockquote id="ref17">17- IBM Corp, "IBM SPSS Statistics Algorithms," <i>ed: IBM Corp Armonk</i>, NY, USA, 2017.</blockquote>

<blockquote id="ref18">18- M. H. Doolittle, "The verification of predictions," <i>Bulletin of the Philosophical Society of Washington</i>, vol. 7, pp. 122-127, 1885.</blockquote>

<blockquote id="ref19">19- H. Eyraud, "Les principes de la mesure des correlations," <i>Ann. Univ. Lyon</i>, III. Ser., Sect. A, vol. 1, no. 30-47, p. 111, 1936.</blockquote>

<blockquote id="ref20">20- E. W. Fager, "Determination and analysis of recurrent groups," <i>Ecology</i>, vol. 38, no. 4, pp. 586-595, 1957.</blockquote>

<blockquote id="ref21">21- E. W. Fager and J. A. McGowan, "Zooplankton Species Groups in the North Pacific: Co-occurrences of species can be used to derive groups whose members react similarly to water-mass types," <i>Science</i>, vol. 140, no. 3566, pp. 453-460, 1963.</blockquote>

<blockquote id="ref22">22- D. P. Faith, "Asymmetric binary similarity measures," <i>Oecologia</i>, vol. 57, pp. 287-290, 1983.</blockquote>

<blockquote id="ref23">23- J. L. Fleiss, B. Levin, and M. C. Paik, <i>Statistical methods for rates and proportions.</i> John Wiley & Sons, 2013.</blockquote>

<blockquote id="ref24">24- S. A. Forbes, <i>On the local distribution of certain Illinois fishes: an essay in statistical ecology.</i> Illinois State Laboratory of Natural History, 1907.</blockquote>

<blockquote id="ref25">25- A. Mozley, "The statistical analysis of the distribution of pond molluscs in western Canada," <i>The American Naturalist</i>, vol. 70, no. 728, pp. 237-244, 1936.</blockquote>

<blockquote id="ref26">26- S. A. Forbes, "Method of determining and measuring the associative relations of species," <i>Science</i>, vol. 61, no. 1585, pp. 518-524, 1925.</blockquote>

<blockquote id="ref27">27- E. G. Fossum and G. Kaskey, "Optimization and standardization of information retrieval language and systems," <i>SPERRY RAND CORP PHILADELPHIA PA UNIVAC DIV</i>, 1966.</blockquote>

<blockquote id="ref28">28- N. Gilbert and T. C. Wells, "Analysis of quadrat data," <i>The Journal of Ecology</i>, pp. 675-685, 1966.</blockquote>

<blockquote id="ref29">29- D. W. Goodall, "The distribution of the matching coefficient," <i>Biometrics</i>, pp. 647-656, 1967.</blockquote>

<blockquote id="ref30">30- B. Austin and R. R. Colwell, "Evaluation of some coefficients for use in numerical taxonomy of microorganisms," <i>International Journal of Systematic and Evolutionary Microbiology</i>, vol. 27, no. 3, pp. 204-210, 1977.</blockquote>

<blockquote id="ref31">31- L. A. Goodman and W. H. Kruskal, <i>Measures of association for cross classifications.</i> Springer, 1979.</blockquote>

<blockquote id="ref32">32- L. Guttman, "An outline of the statistical theory of prediction," <i>The prediction of personal adjustment</i>, vol. 48, pp. 253-318, 1941.</blockquote>

<blockquote id="ref33">33- U. Hamann, "Merkmalsbestand und verwandtschaftsbeziehungen der farinosae: ein beitrag zum system der monokotyledonen," <i>Willdenowia</i>, pp. 639-768, 1961.</blockquote>

<blockquote id="ref34">34- F. C. Harris and B. B. Lahey, "A method for combining occurrence and nonoccurrence interobserver agreement scores," <i>Journal of Applied Behavior Analysis</i>, vol. 11, no. 4, pp. 523-527, 1978.</blockquote>

<blockquote id="ref35">35- R. P. Hawkins and V. A. Dotson, "Reliability Scores That Delude: An Alice in Wonderland Trip Through the Misleading Characteristics of Inter-Observer Agreement Scores in Interval Recording," 1973.</blockquote>

<blockquote id="ref36">36- M. G. Kendall, "A new measure of rank correlation," <i>Biometrika</i>, vol. 30, no. 1/2, pp. 81-93, 1938.</blockquote>

<blockquote id="ref37">37- R. N. Kent and S. L. Foster, "Direct observational procedures: Methodological issues in naturalistic settings," <i>Handbook of behavioral assessment</i>, pp. 279-328, 1977.</blockquote>

<blockquote id="ref38">38- W. Köppen, "In Repertorium für Meteorologie," <i>Akademiia Nauk</i>, pp. 189–238, 1870.</blockquote>

<blockquote id="ref39">39- G. F. Kuder and M. W. Richardson, "The theory of the estimation of test reliability," <i>Psychometrika</i>, pp. 151–160, 1937.</blockquote>

<blockquote id="ref40">40- J. L. Kuhns, "Statistical Association Methods for Mechanized Documentation," <i>National Bureau of Standards Miscellaneous Publication</i>, pp. 33-40, 1964.</blockquote>