.. _continuous-kstwobign:

KStwobign Distribution
======================

This is the limiting distribution of the normalized maximum absolute differences between an
empirical distribution function, computed from :math:`n` samples or observations,
and a comparison (or target) cumulative distribution function.  (``ksone`` is the distribution
of the unnormalized positive differences, :math:`D_n^+`.)

Writing :math:`D_n = \sup_t \left|F_{empirical,n}(t) - F_{target}(t)\right|`,
the normalization factor is :math:`\sqrt{n}`, and ``kstwobign`` is the limiting distribution
of the :math:`\sqrt{n} D_n` values as :math:`n\rightarrow\infty`.

Note that :math:`D_n=\max(D_n^+, D_n^-)`, but :math:`D_n^+` and :math:`D_n^-` are not independent.
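
As a minimal sketch of these definitions, the following pure-Python snippet computes
:math:`D_n^+`, :math:`D_n^-`, :math:`D_n` and the normalized value :math:`\sqrt{n} D_n`
for a simulated sample. The helper names ``normal_cdf`` and ``ks_statistic``, the standard
normal target, the seed and the sample size are assumptions chosen for illustration only,
and the target CDF is assumed to be continuous.

.. code-block:: python

    import math
    import random


    def normal_cdf(t):
        """Standard normal CDF, used here as a hypothetical F_target."""
        return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


    def ks_statistic(sample, target_cdf):
        """Return D_n = max(D_n^+, D_n^-) for a continuous target CDF."""
        xs = sorted(sample)
        n = len(xs)
        # D_n^+: largest amount by which the empirical CDF exceeds the target.
        d_plus = max((i + 1) / n - target_cdf(x) for i, x in enumerate(xs))
        # D_n^-: largest amount by which the target exceeds the empirical CDF.
        d_minus = max(target_cdf(x) - i / n for i, x in enumerate(xs))
        return max(d_plus, d_minus)


    rng = random.Random(0)  # fixed seed (an arbitrary choice) for repeatability
    n = 1000
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]

    d_n = ks_statistic(sample, normal_cdf)
    scaled = math.sqrt(n) * d_n  # the quantity whose limiting law is kstwobign
    print(d_n, scaled)

Both one-sided differences are evaluated on the same sorted sample, which is one way to
see why :math:`D_n^+` and :math:`D_n^-` cannot be independent.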

``kstwobign`` can also be used with the differences between two empirical distribution functions,
for sets of observations with :math:`m` and :math:`n` samples respectively,
where :math:`m` and :math:`n` are "big".
Writing :math:`D_{m,n} = \sup_t \left|F_{1,m}(t)-F_{2,n}(t)\right|`,  where
:math:`F_{1,m}` and :math:`F_{2,n}` are the two empirical distribution functions, then
``kstwobign`` is also the limiting distribution of the :math:`\sqrt{\frac{mn}{m+n}}\, D_{m,n}` values,
as :math:`m,n\rightarrow\infty` and :math:`m/n\rightarrow a \ne 0, \infty`.
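
The two-sample statistic can be sketched in the same way. The function names
``two_sample_ks`` and ``scaled_two_sample_ks`` below are hypothetical, and the snippet
assumes only the definitions above: it evaluates both empirical distribution functions
at every observed point, where the supremum is attained because both are right-continuous
step functions.

.. code-block:: python

    import math
    from bisect import bisect_right


    def two_sample_ks(xs, ys):
        """Return D_{m,n} = sup_t |F_{1,m}(t) - F_{2,n}(t)|."""
        xs, ys = sorted(xs), sorted(ys)
        m, n = len(xs), len(ys)
        # bisect_right(a, t) / len(a) is the empirical CDF of `a` evaluated at t.
        return max(
            abs(bisect_right(xs, t) / m - bisect_right(ys, t) / n)
            for t in xs + ys
        )


    def scaled_two_sample_ks(xs, ys):
        """Return sqrt(m n / (m + n)) * D_{m,n}."""
        m, n = len(xs), len(ys)
        return math.sqrt(m * n / (m + n)) * two_sample_ks(xs, ys)


    print(two_sample_ks([1, 3], [2, 4]))         # interleaved samples
    print(two_sample_ks([1, 2, 3], [4, 5, 6]))   # fully separated samples

For small :math:`m` and :math:`n` the scaled value is only roughly described by
``kstwobign``, since the result above is a limiting statement.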

There are no shape parameters, and the support is :math:`x\in\left[0,\infty\right)`.


.. math::
    :nowrap:

    \begin{eqnarray*}  F\left(x\right) & = & 1 - 2 \sum_{k=1}^{\infty} (-1)^{k-1} e^{-2k^2 x^2}\\
    & = & \frac{\sqrt{2\pi}}{x} \sum_{k=1}^{\infty} e^{-(2k-1)^2 \pi^2/(8x^2)}\\
    & = & 1 - \textrm{scipy.special.kolmogorov}(x) \\
    f\left(x\right) & = & 8x \sum_{k=1}^{\infty} (-1)^{k-1} k^2 e^{-2k^2 x^2} \end{eqnarray*}
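
The formulas above can be checked numerically. The sketch below uses only the standard
library; the function names and the truncation at 100 terms are assumptions made for
illustration, and that truncation is adequate only for moderate :math:`x` (the alternating
series converges slowly as :math:`x\rightarrow 0`, while the second series converges
slowly as :math:`x\rightarrow\infty`). It is not meant to describe how SciPy evaluates
the distribution.

.. code-block:: python

    import math


    def cdf_alternating(x, terms=100):
        """F(x) = 1 - 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 x^2)."""
        if x <= 0:
            return 0.0
        total = sum((-1) ** (k - 1) * math.exp(-2 * k * k * x * x)
                    for k in range(1, terms + 1))
        return 1.0 - 2.0 * total


    def cdf_theta(x, terms=100):
        """F(x) = sqrt(2 pi) / x * sum_{k>=1} exp(-(2k-1)^2 pi^2 / (8 x^2))."""
        if x <= 0:
            return 0.0
        total = sum(math.exp(-((2 * k - 1) ** 2) * math.pi ** 2 / (8 * x * x))
                    for k in range(1, terms + 1))
        return math.sqrt(2 * math.pi) / x * total


    def pdf(x, terms=100):
        """f(x) = 8 x * sum_{k>=1} (-1)^(k-1) * k^2 * exp(-2 k^2 x^2)."""
        if x <= 0:
            return 0.0
        total = sum((-1) ** (k - 1) * k * k * math.exp(-2 * k * k * x * x)
                    for k in range(1, terms + 1))
        return 8.0 * x * total


    for x in (0.5, 1.0, 1.3581, 2.0):
        # The two CDF columns should agree to rounding error at these points.
        print(x, cdf_alternating(x), cdf_theta(x), pdf(x))

At :math:`x = 1.3581` both series give a value of approximately 0.95, which corresponds
to the familiar 5% critical value of the asymptotic Kolmogorov-Smirnov test.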


References
----------

-  "Kolmogorov-Smirnov test", Wikipedia
   https://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test

-  Kolmogoroff, A. "Confidence Limits for an Unknown Distribution Function."
   *Ann. Math. Statist.* 12 (1941), no. 4, 461--463.

-  Smirnov, N. "On the estimation of the discrepancy between empirical curves of distribution for two independent samples."
   *Bull. Math. Univ. Moscou* 2 (1939), 2--26.

-  Feller, W. "On the Kolmogorov-Smirnov Limit Theorems for Empirical Distributions."
   *Ann. Math. Statist.* 19 (1948), no. 2, 177--189, and "Errata," *Ann. Math. Statist.* 21 (1950), no. 2, 301--302.


Implementation: `scipy.stats.kstwobign`