File: plot_sensitivity_sobol.py

package info (click to toggle)
openturns 1.24-4
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 66,204 kB
  • sloc: cpp: 256,662; python: 63,381; ansic: 4,414; javascript: 406; sh: 180; xml: 164; yacc: 123; makefile: 98; lex: 55
file content (224 lines) | stat: -rw-r--r-- 7,107 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
"""
Estimate Sobol' indices for the Ishigami function by a sampling method: a quick start guide to sensitivity analysis
===================================================================================================================
"""

# %%
#
# In this example, we estimate the Sobol' indices for the
# :ref:`Ishigami function <use-case-ishigami>` by sampling methods.
#

# %%
# Introduction
# ------------
#
# In this example we are going to quantify the correlation between the input
# variables and the output variable of a model thanks to Sobol indices.
#
# Sobol indices are designed to evaluate the importance of a single variable
# or a specific set of variables.
# Here the Sobol indices are estimated by sampling from the distributions of
# the input variables and propagating uncertainty through a function.
#
# In theory, Sobol indices range from 0 to 1; the closer an index value is
# to 1, the better the associated input variable explains the function output.
#
# Let us denote by :math:`d` the input dimension of the model.
#
# Sobol' indices can be computed at different orders.
#
# * First order indices evaluate the importance of one input variable
#   at a time.
#
# * Total indices give the relative importance of one input variable
#   and all its interactions with other variables.
#   Alternatively, they can be viewed as measuring how much wriggle room
#   remains to the output when all but one input variables are fixed.
#
# * In general, we are only interested in first order and total Sobol' indices.
#   There are situations, however, where we want to estimate interactions.
#   Second order indices evaluate the importance of every pair of input variables.
#   The number of second order indices is:
#
# .. math::
#    \binom{d}{2} = \frac{d \times \left( d-1\right)}{2}.
#
# In practice, when the number of input variables is not small (say,
# when :math:`d>5`), then the number of second order indices is too large
# to be easily analyzed.
# In these situations, we limit the analysis to the first order and total
# Sobol' indices.

# %%
# Define the model
# ----------------

# %%
from openturns.usecases import ishigami_function
import openturns as ot
import openturns.viewer
import openturns.viewer as viewer
from matplotlib import pylab as plt

ot.Log.Show(ot.Log.NONE)

# %%
# We load the Ishigami model from the usecases model :
im = ishigami_function.IshigamiModel()

# %%
# The :class:`~openturns.usecases.ishigami_function.IshigamiModel` data class contains the input distribution
# :math:`\vect{X}=(X_1, X_2, X_3)` in `im.inputDistribution` and the Ishigami
# function in `im.model`.
# We also have access to the input variable names with:
input_names = im.inputDistribution.getDescription()

# %%
# Draw the function
# -----------------

# %%
n = 10000
sampleX = im.inputDistribution.getSample(n)
sampleY = im.model(sampleX)

# %%
# Display relationships between the output and the inputs

# %%
grid = ot.VisualTest.DrawPairsXY(sampleX, sampleY)
_ = ot.viewer.View(grid, figure_kw={"figsize": (10.0, 4.0)})

# %%
graph = ot.HistogramFactory().build(sampleY).drawPDF()
view = viewer.View(graph)

# %%
# We see that the distribution of the output has two modes.

# %%
# Estimate the Sobol' indices
# ---------------------------

# %%
# We first create a design of experiments with the `SobolIndicesExperiment`.
# Since we are not interested in second order indices for the moment,
# we use the default value of the third argument (we will come back to this
# topic later).

# %%
size = 1000
sie = ot.SobolIndicesExperiment(im.inputDistribution, size)
inputDesign = sie.generate()
input_names = im.inputDistribution.getDescription()
inputDesign.setDescription(input_names)
inputDesign.getSize()

# %%
# We see that 5000 function evaluations are required to estimate the first
# order and total Sobol' indices.

# %%
# Then we evaluate the outputs corresponding to this design of experiments.

# %%
outputDesign = im.model(inputDesign)

# %%
# Then we estimate the Sobol' indices with the `SaltelliSensitivityAlgorithm`.

# %%
sensitivityAnalysis = ot.SaltelliSensitivityAlgorithm(inputDesign, outputDesign, size)

# %%
# Let us estimate first order and total Sobol' indices.

# %%
sensitivityAnalysis.getFirstOrderIndices()

# %%
sensitivityAnalysis.getTotalOrderIndices()

# %%
# In the following graph, the vertical bars represent
# the 95% confidence intervals of the estimates.

# %%
graph = sensitivityAnalysis.draw()
view = viewer.View(graph)

# %%
# - We see that the variable :math:`X_1`, with a total Sobol' index close
#   to 0.6, is the most significant variable, taking into account both its direct
#   effect and its interactions with other variables.
#   Its first order index is close to 0.3, which implies that its interactions
#   alone produce almost 30% (0.6 - 0.3) of the total variance.
# - The variable :math:`X_2` has the highest first order index: approximately 0.4.
#   However, it has little interaction with other variables since its total
#   order index is close to its first order index.
# - The variable :math:`X_3` has a first order index close to zero.
#   However, it has an impact on the total variance thanks to its interactions
#   with :math:`X_1`.
# - We see that the variability of the estimates is quite high even with a
#   relatively large sample size.
#   Moreover, since the exact first order Sobol' index for :math:`X_3` is zero,
#   its estimate has a 50% chance of being nonpositive.

# %%
# Estimate the second order indices
# ---------------------------------

# %%
size = 1000
computeSecondOrder = True
sie = ot.SobolIndicesExperiment(im.inputDistribution, size, computeSecondOrder)
inputDesign = sie.generate()
print(inputDesign.getSize())
inputDesign.setDescription(input_names)
outputDesign = im.model(inputDesign)

# %%
# We see that 8000 function evaluations are now required; that is 3000 more
# evaluations than in the previous situation.

# %%
sensitivityAnalysis = ot.SaltelliSensitivityAlgorithm(inputDesign, outputDesign, size)

# %%
second_order = sensitivityAnalysis.getSecondOrderIndices()
for i in range(im.dim):
    for j in range(i):
        print("2nd order index (%d,%d)=%g" % (i, j, second_order[i, j]))

# %%
# This shows that the only significant interaction is the one between :math:`X_1`
# and :math:`X_3` (beware of Python's index shift: 0 denotes the first input variable).

# %%
# Using a different estimator
# ---------------------------
#
# We have used the `SaltelliSensitivityAlgorithm` class to estimate the indices.
# Others are available in the library:
#
# * `SaltelliSensitivityAlgorithm`
# * `MartinezSensitivityAlgorithm`
# * `JansenSensitivityAlgorithm`
# * `MauntzKucherenkoSensitivityAlgorithm`
#

# %%
# In order to compare the results with another method, we use the
# `MartinezSensitivityAlgorithm` class.

# %%
sensitivityAnalysis = ot.MartinezSensitivityAlgorithm(inputDesign, outputDesign, size)

# %%
graph = sensitivityAnalysis.draw()
view = viewer.View(graph)

plt.show()
# %%
# We see that the results do not change significantly in this particular situation.