.. _pearson_test:

Pearson correlation test
------------------------

The Pearson test checks whether there is a linear relationship between two random
variables :math:`X` and :math:`Y`.

The Pearson test is based on the Pearson correlation coefficient defined in
:ref:`Pearson coefficient <pearson_coefficient>`. It tests whether the Pearson correlation
coefficient is significantly different from zero. When :math:`(X, Y)` is a Gaussian
vector, this is equivalent to testing the independence of :math:`X` and :math:`Y`.

The Pearson test compares the null hypothesis :math:`\cH_0 = \left\{ \rho_P(X,Y) = 0 \right\}` against
the alternative hypothesis :math:`\cH_1 = \left\{ \rho_P(X,Y) \neq 0 \right\}`.

The Pearson coefficient :math:`\rho_P(X,Y)` is estimated from a sample of size
:math:`\sampleSize` of the bivariate random vector :math:`(X,Y)`. The estimate, denoted by
:math:`\hat{\rho}_P(X,Y)`, is given by the relation :eq:`PearsonEstim`.

The test statistic :math:`T(X,Y)` is defined by:

.. math::
  T(X,Y) = \hat{\rho}_P(X,Y) \sqrt{ \dfrac{\sampleSize-2}{1-(\hat{\rho}_P(X,Y))^2} }
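
The formula above can be sketched in plain Python. This is only an illustration,
not the library implementation: the helper name ``pearson_statistic`` and the sample
values are hypothetical, and the estimate is assumed to be the usual sample
correlation coefficient.

```python
import math


def pearson_statistic(x, y):
    """Return (rho_hat, t) for two samples of equal size n >= 3.

    Hypothetical helper: rho_hat is the usual sample correlation
    coefficient, t is the statistic T(X,Y) defined above.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of equal size n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums; the 1/n (or 1/(n-1)) factors cancel in the ratio.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    rho_hat = sxy / math.sqrt(sxx * syy)
    # Undefined when |rho_hat| = 1 (perfectly collinear sample).
    t = rho_hat * math.sqrt((n - 2) / (1.0 - rho_hat ** 2))
    return rho_hat, t


# Made-up sample of size 5, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
rho_hat, t = pearson_statistic(x, y)
```

For this sample the centered sums are 8, 10 and 10, so the estimate is 0.8 and the
statistic equals :math:`4/\sqrt{3}`, about 2.31.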

Under the null hypothesis :math:`\cH_0`, the statistic :math:`T` follows a Student
distribution with :math:`\sampleSize-2` degrees of freedom when :math:`(X,Y)` is a Gaussian vector. In the other
cases, the Student distribution :math:`T(\sampleSize-2)` is asymptotically equivalent to the distribution of
:math:`T`. The library uses the Student distribution :math:`T(\sampleSize-2)` in all cases.

The p-value :math:`p_v` is the probability :math:`p_v = \Prob{|T| \geq |t(X,Y)|}`,
computed under :math:`\cH_0`, where :math:`t(X,Y)` is the realization of
:math:`T(X,Y)` computed on the sample. The null hypothesis
:math:`\cH_0` is rejected if :math:`p_v < p_v^\ell`, where :math:`p_v^\ell` is a specified
threshold (usually 0.1 or 0.05). The threshold :math:`p_v^\ell` is the probability of wrongly
rejecting the null hypothesis :math:`\cH_0`, that is, of committing a Type I error.
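
The decision rule can be illustrated with a small standard-library sketch. It is
not how the library computes the p-value: the function names ``student_pdf`` and
``two_sided_p_value`` are hypothetical, and the Student distribution is integrated
numerically with Simpson's rule for simplicity.

```python
import math


def student_pdf(u, dof):
    """Density of the Student distribution with dof degrees of freedom."""
    log_c = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
             - 0.5 * math.log(dof * math.pi))
    return math.exp(log_c - 0.5 * (dof + 1) * math.log1p(u * u / dof))


def two_sided_p_value(t, dof, intervals=2000):
    """Approximate P(|T| >= |t|) by Simpson's rule (intervals must be even)."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / intervals
    total = student_pdf(0.0, dof) + student_pdf(upper, dof)
    for k in range(1, intervals):
        total += (4 if k % 2 else 2) * student_pdf(k * h, dof)
    # The integral over [0, |t|] is total * h / 3; by symmetry of the
    # density, P(|T| < |t|) is twice that value.
    central_mass = 2.0 * total * h / 3.0
    return max(0.0, 1.0 - central_mass)


# Illustrative values: a statistic of 4/sqrt(3) from a sample of size 5,
# hence 5 - 2 = 3 degrees of freedom, and a threshold of 0.05.
t_obs = 4 / math.sqrt(3)
dof = 3
p_value = two_sided_p_value(t_obs, dof)
reject_h0 = p_value < 0.05
```

With these illustrative values the p-value is roughly 0.10, so :math:`\cH_0` is not
rejected at the 0.05 threshold.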

When the null hypothesis :math:`\cH_0` is rejected, the sample indicates a statistically
significant linear relationship between :math:`X` and :math:`Y`.

.. topic:: API:

    - See :py:func:`~openturns.HypothesisTest.Pearson`
    - See :py:func:`~openturns.HypothesisTest.PartialPearson`
    - See :py:func:`~openturns.HypothesisTest.FullPearson`

.. topic:: Examples:

    - See :doc:`/auto_data_analysis/statistical_tests/plot_test_independence`

.. topic:: References:

    - [saporta1990]_
    - [dixon1983]_
    - [nisthandbook]_
    - [dagostino1986]_
    - [bhattacharyya1997]_
    - [sprent2001]_
    - [burnham2002]_