File: arcsine.qbk

package info (click to toggle)
boost1.88 1.88.0-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 576,932 kB
  • sloc: cpp: 4,149,234; xml: 136,789; ansic: 35,092; python: 33,910; asm: 5,698; sh: 4,604; ada: 1,681; makefile: 1,633; pascal: 1,139; perl: 1,124; sql: 640; yacc: 478; ruby: 271; java: 77; lisp: 24; csh: 6
file content (290 lines) | stat: -rw-r--r-- 11,904 bytes parent folder | download | duplicates (5)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
[section:arcine_dist Arcsine Distribution]

[import ../../example/arcsine_example.cpp] [/ for arcsine snips below]


``#include <boost/math/distributions/arcsine.hpp>``

   namespace boost{ namespace math{

    template <class RealType = double,
              class ``__Policy``   = ``__policy_class`` >
   class arcsine_distribution;

   typedef arcsine_distribution<double> arcsine; // double precision standard arcsine distribution [0,1].

   template <class RealType, class ``__Policy``>
   class arcsine_distribution
   {
   public:
      typedef RealType  value_type;
      typedef Policy    policy_type;

      // Constructor from two range parameters, x_min and x_max:
      BOOST_MATH_GPU_ENABLED arcsine_distribution(RealType x_min = 0, RealType x_max = 1);

      // Range Parameter accessors:
      BOOST_MATH_GPU_ENABLED RealType x_min() const;
      BOOST_MATH_GPU_ENABLED RealType x_max() const;
   };
   }} // namespaces

The class type `arcsine_distribution` represents an
[@http://en.wikipedia.org/wiki/arcsine_distribution arcsine]
[@http://en.wikipedia.org/wiki/Probability_distribution probability distribution function].
The arcsine distribution is named because its CDF uses the inverse sin[super -1] or arcsine.

This is implemented as a generalized version with support from ['x_min] to ['x_max]
providing the 'standard arcsine distribution' as default with ['x_min = 0] and ['x_max = 1].
(A few make other choices for 'standard').

The arcsine distribution is generalized to include any bounded support ['a <= x <= b] by
[@http://reference.wolfram.com/language/ref/ArcSinDistribution.html Wolfram] and
[@http://en.wikipedia.org/wiki/arcsine_distribution Wikipedia],
but also using ['location] and ['scale] parameters by
[@http://www.math.uah.edu/stat/index.html Virtual Laboratories in Probability and Statistics]
[@http://www.math.uah.edu/stat/special/Arcsine.html Arcsine distribution].
The end-point version is simpler and more obvious, so we implement that.
If desired, [@http://en.wikipedia.org/wiki/arcsine_distribution this]
outlines how the __beta_distrib can be used to add a shape factor.

The [@http://en.wikipedia.org/wiki/Probability_density_function probability density function PDF]
for the [@http://en.wikipedia.org/wiki/arcsine_distribution arcsine distribution]
defined on the interval \[['x_min, x_max]\] is given by:

[expression f(x; x_min, x_max) = 1 /([pi][sdot][sqrt]((x - x_min)[sdot](x_max - x_min))]

For example, __WolframAlpha  arcsine distribution, from input of

   N[PDF[arcsinedistribution[0, 1], 0.5], 50]

computes the PDF value

   0.63661977236758134307553505349005744813783858296183

The Probability Density Functions (PDF) of generalized arcsine distributions are symmetric U-shaped curves,
centered on ['(x_max - x_min)/2],
highest (infinite) near the two extrema, and quite flat over the central region.

If random variate ['x] is ['x_min] or  ['x_max], then the PDF is infinity.
If random variate ['x] is ['x_min] then the CDF is zero.
If random variate ['x] is ['x_max] then the CDF is unity.

The 'Standard' (0, 1) arcsine distribution is shown in blue
and some generalized examples with other ['x] ranges.

[graph arcsine_pdf]

The Cumulative Distribution Function CDF is defined as

[expression F(x) = 2[sdot]arcsin([sqrt]((x-x_min)/(x_max - x))) / [pi]]

[graph arcsine_cdf]

[h5 Constructor]

   arcsine_distribution(RealType x_min, RealType x_max);

constructs an arcsine distribution with range parameters ['x_min] and ['x_max].

Requires ['x_min < x_max], otherwise __domain_error is called.

For example:

   arcsine_distribution<> myarcsine(-2, 4);

constructs an arcsine distribution with  ['x_min = -2] and ['x_max = 4].

Default values of  ['x_min = 0] and ['x_max = 1] and a ` typedef arcsine_distribution<double> arcsine;`  mean that

  arcsine as;

constructs a 'Standard 01' arcsine distribution.

[h5 Parameter Accessors]

   BOOST_MATH_GPU_ENABLED RealType x_min() const;
   BOOST_MATH_GPU_ENABLED RealType x_max() const;

Return the parameter ['x_min] or  ['x_max] from which this distribution was constructed.

So, for example:

[arcsine_snip_8]

[h4 Non-member Accessor Functions]

All the [link math_toolkit.dist_ref.nmp usual non-member accessor functions]
that are generic to all distributions are supported: __usual_accessors.
For this distribution all non-member accessor functions are marked with `BOOST_MATH_GPU_ENABLED` and can
be run on both host and device.

The formulae for calculating these are shown in the table below, and at
[@http://mathworld.wolfram.com/arcsineDistribution.html Wolfram Mathworld].

[note There are always [*two] values for the [*mode], at ['x_min] and at ['x_max], default 0 and 1,
so instead we raise the exception __domain_error.
At these extrema, the PDFs are infinite, and the CDFs zero or unity.]

[h4 Applications]

The arcsine distribution is useful to describe
[@http://en.wikipedia.org/wiki/Random_walk Random walks], (including drunken walks)
[@http://en.wikipedia.org/wiki/Brownian_motion Brownian motion],
[@http://en.wikipedia.org/wiki/Wiener_process  Weiner processes],
[@http://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trials],
and their application to solve  stock market and other
[@http://en.wikipedia.org/wiki/Gambler%27s_ruin ruinous gambling games].

The random variate ['x] is constrained to ['x_min] and ['x_max], (for our 'standard' distribution, 0 and 1),
and is usually some fraction.  For any other ['x_min] and ['x_max] a fraction can be obtained from ['x] using

[expression fraction = (x - x_min) / (x_max - x_min)]

The simplest example is tossing heads and tails with a fair coin and modelling the risk of losing, or winning.
Walkers (molecules, drunks...) moving  left or right of a centre line are another common example.

The random variate ['x] is the fraction of time spent on the 'winning' side.
If half the time is spent on the 'winning' side (and so the other half on the 'losing' side) then ['x = 1/2].

For large numbers of tosses, this is modelled by the (standard \[0,1\]) arcsine distribution,
and the PDF can be calculated thus:

[arcsine_snip_2]

From the plot of PDF, it is clear that  ['x] = [frac12] is the [*minimum] of the curve,
so this is the [*least likely] scenario.
(This is highly counter-intuitive, considering that fair tosses must [*eventually] become equal.
It turns out that ['eventually] is not just very long, but [*infinite]!).

The [*most likely] scenarios are towards the extrema where ['x] = 0 or ['x] = 1.

If fraction of time on the left is a [frac14],
it is only slightly more likely because the curve is quite flat bottomed.

[arcsine_snip_3]

If we consider fair coin-tossing games being played for 100 days
(hypothetically continuously to be 'at-limit')
the person winning after day 5 will not change in fraction 0.144 of the cases.

We can easily compute this setting ['x] = 5./100 = 0.05

[arcsine_snip_4]

Similarly, we can compute from a fraction of 0.05 /2 = 0.025
(halved because we are considering both winners and losers)
corresponding to 1 - 0.025 or 97.5% of the gamblers, (walkers, particles...) on the [*same side] of the origin

[arcsine_snip_5]

(use of the complement gives a bit more clarity,
and avoids potential loss of accuracy when ['x] is close to unity, see __why_complements).

[arcsine_snip_6]

or we can reverse the calculation by assuming a fraction of time on one side, say fraction 0.2,

[arcsine_snip_7]

[*Summary]: Every time we toss, the odds are equal,
so on average we have the same change of winning and losing.

But this is [*not true] for an an individual game where one will be [*mostly in a bad or good patch].

This is quite counter-intuitive to most people, but the mathematics is clear,
and gamblers continue to provide proof.

[*Moral]: if you in a losing patch, leave the game.
(Because the odds to recover to a good patch are poor).

[*Corollary]: Quit while you are ahead?

A working example is at [@../../example/arcsine_example.cpp  arcsine_example.cpp]
including sample output .

[h4 Related distributions]

The arcsine distribution with ['x_min = 0]  and ['x_max = 1] is special case of the
__beta_distrib with [alpha] = 1/2 and [beta] = 1/2.

[h4 Accuracy]

This distribution is implemented using sqrt, sine, cos and arc sine and cos trigonometric functions
which are normally accurate to a few __epsilon.
But all values suffer from [@http://en.wikipedia.org/wiki/Loss_of_significance loss of significance or cancellation error]
for values of ['x] close to ['x_max].
For example, for a standard [0, 1] arcsine distribution ['as], the pdf is symmetric about random variate ['x = 0.5]
so that one would expect `pdf(as, 0.01) == pdf(as, 0.99)`.  But as ['x] nears unity, there is increasing
[@http://en.wikipedia.org/wiki/Loss_of_significance loss of significance].
To counteract this, the complement versions of CDF and quantile
are implemented with alternative expressions using ['cos[super -1]] instead of ['sin[super -1]].
Users should see __why_complements for guidance on when to avoid loss of accuracy by using complements.

[h4 Testing]
The results were tested against a few accurate spot values computed by __WolframAlpha, for example:

      N[PDF[arcsinedistribution[0, 1], 0.5], 50]
      0.63661977236758134307553505349005744813783858296183

[h4 Implementation]

In the following table ['a] and ['b] are the parameters ['x_min] and ['x_max],
['x] is the random variable, ['p] is the probability and its complement ['q = 1-p].

[table
[[Function][Implementation Notes]]
[[support] [x [isin] \[a, b\], default x [isin] \[0, 1\] ]]
[[pdf] [f(x; a, b) = 1/([pi][sdot][sqrt](x - a)[sdot](b - x))]]
[[cdf] [F(x) = 2/[pi][sdot]sin[super-1]([sqrt](x - a) / (b - a) ) ]]
[[cdf of complement] [2/([pi][sdot]cos[super-1]([sqrt](x - a) / (b - a)))]]
[[quantile] [-a[sdot]sin[super 2]([frac12][pi][sdot]p) + a + b[sdot]sin[super 2]([frac12][pi][sdot]p)]]
[[quantile from the complement] [-a[sdot]cos[super 2]([frac12][pi][sdot]p) + a + b[sdot]cos[super 2]([frac12][pi][sdot]q)]]
[[mean] [[frac12](a+b)]]
[[median] [[frac12](a+b)]]
[[mode] [ x [isin] \[a, b\], so raises domain_error (returning NaN).]]
[[variance] [(b - a)[super 2] / 8]]
[[skewness] [0]]
[[kurtosis excess] [ -3/2  ]]
[[kurtosis] [kurtosis_excess + 3]]
]

The quantile was calculated using an expression obtained by using __WolframAlpha
to invert the formula for the CDF thus

  solve [p - 2/pi sin^-1(sqrt((x-a)/(b-a))) = 0, x]

which was interpreted as

  Solve[p - (2 ArcSin[Sqrt[(-a + x)/(-a + b)]])/Pi == 0, x, MaxExtraConditions -> Automatic]

and produced the resulting expression

[expression x = -a sin^2((pi p)/2)+a+b sin^2((pi p)/2)]

Thanks to Wolfram for providing this facility.

[h4 References]

* [@http://en.wikipedia.org/wiki/arcsine_distribution Wikipedia arcsine distribution]
* [@http://en.wikipedia.org/wiki/Beta_distribution Wikipedia Beta distribution]
* [@http://mathworld.wolfram.com/BetaDistribution.html Wolfram MathWorld]
* [@http://www.wolframalpha.com/ Wolfram Alpha]

[h4 Sources]

*[@http://estebanmoro.org/2009/04/the-probability-of-going-through-a-bad-patch The probability of going through a bad patch]  Esteban Moro's Blog.
*[@http://www.gotohaggstrom.com/What%20do%20schmucks%20and%20the%20arc%20sine%20law%20have%20in%20common.pdf  What soschumcks and the arc sine have in common] Peter Haggstrom.
*[@http://www.math.uah.edu/stat/special/Arcsine.html arcsine distribution].
*[@http://reference.wolfram.com/language/ref/ArcSinDistribution.html Wolfram reference arcsine examples].
*[@http://www.math.harvard.edu/library/sternberg/slides/1180908.pdf Shlomo Sternberg slides].


[endsect] [/section:arcsine_dist arcsine]

[/ arcsine.qbk
  Copyright 2014 John Maddock and Paul A. Bristow.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]