File: monte_carlo_moments.rst

package info (click to toggle)
openturns 1.24-4
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 66,204 kB
  • sloc: cpp: 256,662; python: 63,381; ansic: 4,414; javascript: 406; sh: 180; xml: 164; yacc: 123; makefile: 98; lex: 55
file content (104 lines) | stat: -rw-r--r-- 4,607 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
.. _monte_carlo_moments:

Estimating moments with Monte Carlo
-----------------------------------

Let us denote
:math:`\outputRV = \model\left( \inputRV \right) = \left( Y^1,\ldots,Y^{\outputDim} \right)`,
where :math:`\inputRV= \left( X^1,\ldots,X^{\inputDim} \right)` is a random
vector. We seek here to
evaluate, the characteristics of the central part (central tendency and
spread i.e. mean and variance) of the probability distribution of a
variable :math:`Y^i`, using the probability distribution of the random
vector :math:`\inputRV`.

The Monte Carlo method is a numerical integration method using sampling,
which can be used, for example, to determine the mean and standard
deviation of a random variable :math:`Y^i` (if these quantities exist,
which is not the case for all probability distributions):

.. math::

   \begin{aligned}
       m_{Y^i} = \int u \, f_{Y^i}(u) \, du,\ \sigma_{Y^i} = \sqrt{\int \left( u-m_{Y^i} \right)^2 \, f_{Y^i}(u) \, du}
     \end{aligned}

where :math:`f_{Y^i}` represents the probability density function of
:math:`Y^i`.

Suppose now that we have the sample
:math:`\left\{ y^i_1,\ldots,y^i_\sampleSize \right\}` of :math:`\sampleSize` values randomly
and independently sampled from the probability distribution
:math:`f_{Y^i}`; this sample can be obtained by drawing a :math:`\sampleSize`
sample :math:`\left\{ \vect{x}_1,\ldots,\vect{x}_\sampleSize \right\}` of the
random vector :math:`\inputRV` (the distribution of which is known) and
by computing
:math:`\vect{y}_j =  \model \left( \vect{x}_j \right) \ \forall 1 \leq j \leq \sampleSize`.
Then, the Monte-Carlo estimations for the mean and standard deviation
are the empirical mean and standard deviations of the sample:

.. math::

   \begin{aligned}
       \widehat{m}_{Y^i} = \frac{1}{\sampleSize} \sum_{j=1}^\sampleSize y^i_j,\ \widehat{\sigma}_{Y^i} = \sqrt{\frac{1}{\sampleSize} \sum_{j=1}^\sampleSize \left( y^i_j - \widehat{m}_{Y^i} \right)^2}
     \end{aligned}

These are just estimations, but by the law of large numbers their
convergence to the real values :math:`m_{Y^i}` and :math:`\sigma_{Y^i}`
is assured as the sample size :math:`\sampleSize` tends to infinity. The Central
Limit Theorem enables the difference between the estimated value and the
sought value to be controlled by means of a confidence interval
(especially if :math:`\sampleSize` is sufficiently large, typically :math:`\sampleSize` > a few
dozens even if there is now way to say for sure if the asymptotic
behavior is reached). For a probability :math:`\alpha` strictly between
:math:`0` and :math:`1` chosen by the user, one can, for example, be sure with a
confidence :math:`\alpha`, that the true value of :math:`m_{Y^i}` is
between :math:`\widehat{m}_{i,\inf}` and :math:`\widehat{m}_{i,\sup}`
calculated analytically from simple formulae. To illustrate, for
:math:`\alpha = 0.95`:

.. math::

   \begin{aligned}
       \widehat{m}_{i,\inf} = \widehat{m}_{Y^i} - 1.96 \frac{\displaystyle \widehat{\sigma}_{Y^i}}{\displaystyle \sqrt{\sampleSize}},\ \widehat{m}_{i,\sup} = \widehat{m}_{Y^i} + 1.96 \frac{\widehat{\sigma}_{Y^i}}{\sqrt{\sampleSize}},\ \textrm{that is to say}\ \textrm{Pr} \left(  \widehat{m}_{i,\inf} \leq m_{Y^i} \leq \widehat{m}_{i,\sup} \right) = 0.95
     \end{aligned}

The size of the confidence interval, which represents the uncertainty
of this mean estimation, decreases as :math:`\sampleSize` increases but more
gradually (the rate is proportional to :math:`\sqrt{\sampleSize}`: multiplying
:math:`\sampleSize` by :math:`100` reduces the length of the confidence interval
:math:`\left| \widehat{m}_{i,\inf}-\widehat{m}_{i,\sup} \right|` by a
factor :math:`10`).

This method is also referred to as Direct sampling, crude Monte Carlo method, Classical Monte Carlo
integration.

.. plot::

    import openturns as ot
    from openturns.viewer import View

    dist = ot.LogNormal()
    dist.setDescription(['Z'])
    sample = dist.getSample(100)
    graph = dist.drawPDF()
    histogram = ot.HistogramFactory().build(sample).drawPDF().getDrawable(0)
    histogram.setColor('blue')
    graph.add(histogram)
    View(graph)

.. topic:: API:

    - See :class:`~openturns.Sample`

.. topic:: Examples:

    - See :doc:`/auto_data_analysis/manage_data_and_samples/plot_estimate_moments`


.. topic:: References:

    - Robert C.P., Casella G. (2004). "Monte Carlo Statistical Methods", Springer, ISBN 0-387-21239-6, 2nd ed.
    - Rubinstein R.Y. (1981). "Simulation and The Monte Carlo methods", John Wiley \& Sons
    - "Guide to the expression of Uncertainty in Measurements (GUM)", ISO publication