File: order.shtml

package info (click to toggle)
oakleaf 1.0.1-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 2,536 kB
  • sloc: sh: 3,972; f90: 337; makefile: 174; awk: 11
file content (234 lines) | stat: -rw-r--r-- 6,006 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
<!DOCTYPE html>
<html lang="en">
  <!-- #include "meta.shtml" -->
<body>
  <!-- #include "head.shtml" -->

<h1>Quantil mean</h1>

  <h2>Synopsis</h2>

<pre class="fortran">

use oakleaf

subroutine <span class="subroutine">qmean</span>(data,mean,stderr,stdsig,flag)
   real, dimension(:), intent(in) :: data
   real, intent(out) :: mean
   real, intent(out), optional :: stderr,stdsig
   logical, intent(in), optional :: verbose
   integer, intent(out), optional :: flag
end subroutine qmean
</pre>

   <h2>Description</h2>

  <p>
    This routine estimates both averadge and standard deviation
    of a data set on base of the
    <a href="https://en.wikipedia.org/wiki/Order_statistic">order statistics</a>,
    i.e.  <a href="https://en.wikipedia.org/wiki/Quantile_function">quantiles.</a>
    The median, the <i>Q(0.5)</i> quantile, is claimed as the (robust) mean.
  </p>
  <p>
    The standard deviation (and consequently the standard error)
    is estimated preferably
    by the <i>Q<sub>n</sub></i>-estimator due Rousseeuw & Croux;
    one is more durable on
    highly contamined samples, yet additional memory space is required.
    In other cases (<i>N &lt; 3</i> or <i>N &gt; 2**16</i>),
    the deviations is determined as <i>MAD/0.6745</i>,
    where <i>MAD</i> is the median of absolute deviations.
  </p>

<p>
  Both the median and the <i>Q<sub>n</sub>,MAD</i> are robust estimators,
  however they
  are generally incompatible to least-square estimates, as being
  granulated and deviated for contamined data.
  </p>

<p>
  The <a href="https://en.wikipedia.org/wiki/Quicksort">quick sort</a> algorithm
  is used for small samples <i>N &le; 42</i>;
  one requires <i>N</i> log <i>N</i> operations.
  The larger samples are estimated by the quick median which needs
  <i>2N</i> operations
  (Wirth,D.: Algorithm and data structures, chapter Finding the Median).
  The <i>Q<sub>n</sub></i>-estimator requires <i>N**2/2</i> elements.
</p>

  <h2>Parameters</h2>

  <h3>On input:</h3>
  <dl>
    <dt>data</dt><dd> array of data</dd>
  </dl>

  <h3>On output:</h3>
  <dl>
    <dt>mean</dt><dd> mean by median</dd>
    <dt>stderr</dt><dd>(optional) standard error</dd>
    <dt>stdsig</dt><dd>(optional) standard deviation by MAD</dd>
    <dt>verbose </dt><dd> (optional) print a detailed information
      about the calulations.</dd></dt>
    <dt>flag</dt><dd>(optional) a flag,
      see <a href="status.html">the status codes</a>.</dd>
  </dl>

  <h2>Example</h2>

  <p>
    Save the program to the file <samp>example.f08</samp>:
  </p>
  <pre>
program example

   use oakleaf

   real, dimension(5) :: data = [ 1, 2, 3, 4, 5 ]
   real :: mean,stderr,stdsig

   call qmean(data,mean,stderr,stdsig)
   write(*,*) 'qmean:',mean,' stderr:',stderr,'stdsig:',stdsig

end program example
  </pre>
  <p>
    Than compile and run it:
  </p>
  <pre>
$ gfortran -I/usr/include example.f08 -L/usr/lib -loakleaf -lminpack
$ ./a.out
 qmean:   3.00000000      stderr:  0.992554188     stdsig:   2.21941876
  </pre>

<!--
  <h2>Median and MAD estimates</h2>

  <p>This routine estimates the averadge by
    <a href="https://en.wikipedia.org/wiki/Median">median</a>
    and the standard deviation by the median of absolute deviations (MAD).
  </p>

  <pre>
    subroutine <span class="subroutine">madmed(x,t,s)</span>
    subroutine <span class="subroutine">madwmed(x,dx,t,s,d)</span>

       real(kind), dimension(:),intent(in) :: x, dx
       real(kind), intent(out) :: t,s
    end subroutine qmean

 <b>On input:</b>
      x - array of data values to be estimated
      dx (optional) - array of the data errors

 <b>On output</b> are estimated:
      t - median
      s - standard deviation as MAD/Q50

  </pre>

  -->

<!--
<p style="margin:2em; text-align: center;">☘</p>

<h1>Auxiliary tools</h1>

    <h2>Empirical CDF</h2>

<p>
        The following routine construct	the
	<a href="https://en.wikipedia.org/wiki/Empirical_distribution_function">
	  Empirical cumulative distribution function</a> (CDF)
	from data.</p>

  <pre>
subroutine <span class="subroutine">ecdf(data,x,p)</span>
    real, dimension(:), intent(in) :: data
    real, dimension(:), intent(out) :: x,p
end subroutine ecdf
  </pre>

  <h3>On input:</h3>
  <dl>
    <dt>data</dt><dd> an array of data values to be estimated</dd>
  </dl>

  <h3>On output:</h3>
  <dl>
    <dt>x</dt><dd> coordinates of absicca of steps</dd>
    <dt>p</dt><dd> step values</dd>
  </dl>

    <h2>Quantiles</h2>

  <p>The estimates <a href="https://en.wikipedia.org/wiki/Quantile_function">
    Q-quantiles</a> by a linear interpolation of empirical CDF
  between discrete CDF steps.
  This is the inverse function of the empirical CDF.
  </p>

  <pre>
function <span class="subroutine">quantile(q,x,y) result(t)</span>
   real, intent(in) :: q
   real, dimension(:), intent(in) :: x,y
end function quantile
  </pre>

  <h3>On input:</h3>
  <dl>
    <dt>q</dt><dd> quantile as fraction (0 &le; q &le; 1)</dd>
    <dt>x,y</dt><dd> the empirical CDF</dd>
  </dl>

  <h3>Result:</h3>
  <p>
  The lineary interpolated value of <i>x</i> abscicca for the given quantile.
  </p>

  <h2>An example</h3>
  <p>
    Save following program listing to the file <samp>example.f08</samp>:
  </p>
  <pre>
program example

   use oakleaf

   real, dimension(5) :: data = [ 1, 2, 3, 4, 5 ]
   real, dimension(5) :: xcf, ycdf

   call ecdf(data,xcdf,ycdf)
   write(*,*) 'Q(0.75):',quantile(0.75,xcdf,ycdf)

end program example
  </pre>
  <p>
    Than compile and run it:
  </p>
  <pre>
$ gfortran -I$HOME/scratch/opt/include q.f08 -L$HOME/scratch/opt/lib -loakleaf -lminpack
$ ./a.out
 Q(0.75):   4.500000000
  </pre>

-->

  <h2>References</h2>

<p>
  Wirth,D.: Algorithm and data structures,
  Prentice-Hall International editions (1986)
</p>

<p>Rousseeuw, Peter J and Croux, Christophe:
  Alternatives to the median absolute deviation,
  Journal of the American Statistical association vol. 88, 1273--1283 (1993)
</p>


<!-- #include "foot.shtml" -->
</body>
</html>