<HTML>
<HEAD>
<TITLE>STATISTICS command</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000">

<p><font size="+3" color="green"><B>STATISTICS command</B></font></P>

<TABLE border="1" cols="2" frame="box" rules="all" width="572">
<TR>
<TD width="15%" valign="top"><B>Syntax</B>:</TD>
<TD width="85%" valign="top"><CODE>
STATISTICS x { s1\keyword { s2\keyword ... }}<br />
STATISTICS\PEARSON x y { rcof prob }<br />
STATISTICS\MOMENTS w x n { sout }</CODE>
</TD></TR>
<TR>
<TD valign="top"><B>Qualifiers</B>:</TD>
<TD valign="top"><CODE>\MESSAGES, \WEIGHTS, \MOMENTS, \PEARSON</CODE></TD></TR>
<TR>
<TD valign="top"><B>Defaults</B>:</TD>
<TD valign="top"><CODE>\MESSAGES, \-WEIGHTS</CODE></TD></TR>
<TR>
<TD valign="top"><B>Examples</B>:</TD>
<TD valign="top"><CODE>
STATISTICS X<br />
STATISTICS\-MESS X XMED\MEDIAN XMEAN\MEAN<BR />
STATISTICS\WEIGHTS W X XVAR\VARIANCE XSUM\SUM<BR />
STATISTICS\MOMENTS Y X 3 M3</CODE>
</TD></TR>
</TABLE>
<P>
 The <CODE>STATISTICS</CODE> command calculates various statistics
 for the input variable <CODE>x</CODE>, which can be
 a vector or a matrix. Each statistic is selected by appending a qualifier keyword
 to an output parameter with a backslash, for example <CODE>XMED\MEDIAN</CODE>. All
 vectors must be the same size.</P>
<P>
 Table 1 below shows the parameter qualifier keywords and corresponding output values for extrema.
 Table 2 shows the parameter qualifier keywords and corresponding output values for central measures.
 Table 3 shows the parameter qualifier keywords and corresponding output values for dispersion and
 skewness.</p>
<p>
 <center><table border="1" width="400">
 <tr>
 <td><i>Keyword</i></td>
 <td><i>Output Value</i></td>
 </tr><tr>
 <td><CODE>\MAX</CODE></td>
 <td>maximum value of <CODE>x</CODE></td>
 </tr><tr>
 <td><CODE>\IMAX</CODE></td>
 <td>index of the maximum if <CODE>x</CODE> is a vector<br />
  row index of the maximum if <CODE>x</CODE> is a matrix</td>
 </tr><tr>
 <td><CODE>\JMAX</CODE></td>
 <td>column index of the maximum if <CODE>x</CODE> is a matrix</td>
 </tr><tr>
 <td><CODE>\MIN</CODE></td>
 <td>minimum value of <CODE>x</CODE></td>
 </tr><tr>
 <td><CODE>\IMIN</CODE></td>
 <td>index of the minimum if <CODE>x</CODE> is a vector<br />
  row index of the minimum if <CODE>x</CODE> is a matrix</td>
 </tr><tr>
 <td><CODE>\JMIN</CODE></td>
 <td>column index of the minimum value if <CODE>x</CODE> is a matrix</td>
 </tr></table>
 <table width="400" border="0">
 <tr><td align="center"><b>Table 1:</b> Extrema keywords</td>
 </tr></table></center></p>
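<p>
 As an illustration of the vector case in Table 1, the following Python sketch returns the
 extrema together with their positions. It is not part of the <CODE>STATISTICS</CODE> command:
 the function name <code>extrema_with_indices</code> is hypothetical, the 1-based index is inferred
 from the example on this page (<code>IMX=7</code> for a 7-element vector), and reporting the first
 occurrence when values are tied is an assumption.</p>

```python
def extrema_with_indices(x):
    """Return (max, imax, min, imin) for a non-empty sequence, with 1-based indices."""
    if len(x) == 0:
        raise ValueError("x must not be empty")
    # max()/min() over the index range return the first index on ties.
    imax = max(range(len(x)), key=lambda i: x[i])
    imin = min(range(len(x)), key=lambda i: x[i])
    return x[imax], imax + 1, x[imin], imin + 1


x = [1.2, 2.1, 3.2, 4.5, 5, 6, 7]
print(extrema_with_indices(x))  # prints (7, 7, 1.2, 1)
```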
<p>
 <center><table border="1" width="400">
 <tr>
 <td><i>Keyword</i></td><td><i>Output Value</i></td>
 </tr><tr>
 <td><CODE>\SUM</CODE></td><td>arithmetic sum (unweighted)</td>
 </tr><tr>
 <td><CODE>\MEAN</CODE></td><td>arithmetic mean</td>
 </tr><tr>
 <td><CODE>\GMEAN</CODE></td><td>geometric mean</td>
 </tr><tr>
 <td><CODE>\MEDIAN</CODE></td><td>median value</td>
 </tr><tr>
 <td><CODE>\RMS</CODE></td><td>root-mean-square</td>
 </tr></table>
 <table width="400" border="0">
 <tr><td align="center"><b>Table 2:</b> Central measure keywords</td>
 </tr></table></center></p>
<p>
 <center><table border="1" width="400">
 <tr>
 <td><i>Keyword</i></td><td><i>Output Value</i></td>
 </tr><tr>
 <td><CODE>\VARIANCE</CODE></td><td>variance</td>
 </tr><tr>
 <td><CODE>\SDEV</CODE></td><td>standard deviation</td>
 </tr><tr>
 <td><CODE>\ADEV</CODE></td><td>average deviation</td>
 </tr><tr>
 <td><CODE>\KURTOSIS</CODE></td><td>kurtosis</td>
 </tr><tr>
 <td><CODE>\SKEWNESS</CODE></td><td>skewness</td>
 </tr></table>
 <table width="400" border="0">
 <tr><td align="center"><b>Table 3:</b> Dispersion and skewness keywords</td>
 </tr></table></center></p>
<p>
 <font size="+2" color="green">Informational messages</font></p>
<p>
 The default is to display all the calculated statistics. If the
 <CODE>\-MESSAGES</CODE> command qualifier is used, and if at least one output scalar is entered,
 then the statistics values will not be displayed.</p>
<p>
 <font size="+2" color="green">Weights</font></p>
<p>
 <TABLE border="1" cols="2" frame="box" rules="all" width="572">
 <TR>
 <TD width="15%" valign="top"><B>Syntax</B>:</TD>
 <TD width="85%" valign="top"><CODE>
 STATISTICS\WEIGHTS w x { s1\keyword { s2\keyword ... }}</CODE>
 </TD></TR></TABLE></p>
<p>
 You <EM>must</EM> use the <CODE>\WEIGHTS</CODE>
 qualifier to indicate that a weight vector is present. Weights cannot be
 applied to matrix data.</p>
<P>
 A weighting factor, <CODE>w[i] &ge; 0</CODE>,
 could be the frequency, the probability, the mass, the reliability, or some
 other multiplier. The lengths of <CODE>w</CODE> and <CODE>x</CODE> must be equal.</p>
<p>
 <font size="+2" color="green">Definitions</font></p>
<p>
 Suppose that <code>x</code> is a vector with <code>N</code> elements.</P>
<P>
 If a weight vector, <code>w</code>, is entered, remember to use the
 <CODE>\WEIGHTS</CODE> command qualifier. The
 length of <code>w</code> is assumed to also be <code>N</code>. If no weights are entered,
 let <code>w<sub>i</sub></code> default to <CODE>1</CODE>, for <code>i = 1,2,...,N</code>.
 Define the total weight: <code>W = w<sub>1</sub> + w<sub>2</sub> + ... + w<sub>N</sub></code></p>
<P>
 <font size="+1" color="green">Sum</font></p>
<P>
 The sum is defined by <code>x<sub>1</sub> + x<sub>2</sub> + ... + x<sub>N</sub></code></p>
<P>
 <font size="+1" color="green">Mean value</font></p>
<P>
 The mean value, <code>M</code>, is defined by</p>
<p>
 <center><code>M = (1/W)*[w<sub>1</sub>x<sub>1</sub> + 
 w<sub>2</sub>x<sub>2</sub> + ... + w<sub>N</sub>x<sub>N</sub>]</code></center></p>
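<p>
 The mean definition above can be sketched in Python as follows. This is an illustration only,
 not the program's implementation; the function name <code>weighted_mean</code> is hypothetical, and
 omitting <code>w</code> stands for the default of unit weights described above.</p>

```python
def weighted_mean(x, w=None):
    """M = (1/W) * sum(w_i * x_i), with unit weights when w is omitted."""
    if w is None:
        w = [1.0] * len(x)
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    total_weight = sum(w)  # W in the definition above
    return sum(wi * xi for wi, xi in zip(w, x)) / total_weight


print(weighted_mean([1.2, 2.1, 3.2, 4.5, 5, 6, 7]))  # unweighted mean of the example vector
print(weighted_mean([1, 2, 3], [1, 1, 2]))           # weighted: (1 + 2 + 6) / 4
```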
<P>
 <font size="+1" color="green">Geometric mean</font></p>
<P>
 The geometric mean, <code>G<sub>x</sub></code>, is defined if each <code>x<sub>i</sub> &ge; 0</code>
 by:</p>
<p>
 <center><code>G<sub>x</sub> = exp{(1/W)*[w<sub>1</sub>log(x<sub>1</sub>) +
 w<sub>2</sub>log(x<sub>2</sub>) + ... +
 w<sub>N</sub>log(x<sub>N</sub>)]}</code></center></p> 
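<p>
 A minimal Python sketch of the geometric mean, taking the exponential of the weighted average of
 the logarithms. The function name <code>geometric_mean</code> is hypothetical. As an assumption of
 this sketch only, values must be strictly positive, because <code>log(0)</code> is undefined; how
 the program itself treats a zero value is not stated here.</p>

```python
import math


def geometric_mean(x, w=None):
    """G = exp((1/W) * sum(w_i * log(x_i))), with unit weights when w is omitted."""
    if w is None:
        w = [1.0] * len(x)
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    if any(xi <= 0 for xi in x):
        # Assumption of this sketch: reject zeros and negatives outright.
        raise ValueError("all x values must be positive")
    total_weight = sum(w)
    log_sum = sum(wi * math.log(xi) for wi, xi in zip(w, x))
    return math.exp(log_sum / total_weight)


print(geometric_mean([1, 4]))          # close to 2, the square root of 1*4
print(geometric_mean([2, 8], [3, 1]))  # close to 2**1.5
```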
<P>
 <font size="+1" color="green">Median</font></p>
<P>
 The median is the element of <code>x</code> which has equal numbers of values above
 it and below it. If <code>N</code> is even, no single element satisfies this, and the median is
 the average of the two central values.</p>
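<p>
 The odd and even cases of the median can be sketched in Python as follows. The function name
 <code>median</code> is hypothetical, and the sketch covers the unweighted case only.</p>

```python
def median(x):
    """Middle value of the sorted data; average of the two central values if N is even."""
    if len(x) == 0:
        raise ValueError("x must not be empty")
    ordered = sorted(x)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1.2, 2.1, 3.2, 4.5, 5, 6, 7]))  # prints 4.5
print(median([4, 1, 3, 2]))                   # prints 2.5
```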
<P>
 <font size="+1" color="green">Root-mean-square</font></p>
<P>
 The root-mean-square, <code>RMS</code>, is defined by</p>
<p>
 <center><code>RMS = sqrt([1/W]*[w<sub>1</sub>x<sub>1</sub><sup>2</sup> +
 w<sub>2</sub>x<sub>2</sub><sup>2</sup>
 + ... + w<sub>N</sub>x<sub>N</sub><sup>2</sup>])</code></center></p>
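<p>
 The root-mean-square definition can be sketched in Python as follows. The function name
 <code>rms</code> is hypothetical, and omitting <code>w</code> stands for unit weights.</p>

```python
import math


def rms(x, w=None):
    """RMS = sqrt((1/W) * sum(w_i * x_i**2)), with unit weights when w is omitted."""
    if w is None:
        w = [1.0] * len(x)
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    total_weight = sum(w)
    return math.sqrt(sum(wi * xi ** 2 for wi, xi in zip(w, x)) / total_weight)


print(rms([3, 4]))  # sqrt((9 + 16) / 2)
```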
<P>
 <font size="+1" color="green">Variance</font></p>
<P>
 The variance, <code>&mu;</code>, is defined by</p>
<p>
 <center><code>&mu; = {N/[W*(N-1)]}*[w<sub>1</sub>(x<sub>1</sub>-M)<sup>2</sup> + 
 w<sub>2</sub>(x<sub>2</sub>-M)<sup>2</sup> + ... +
 w<sub>N</sub>(x<sub>N</sub>-M)<sup>2</sup>]</code></center></p>
<P>
 <font size="+1" color="green">Standard deviation</font></p>
<P>
 The standard deviation, <code>&sigma;</code>, is defined by <code>&sigma; = sqrt(&mu;)</code></p>
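<p>
 A Python sketch of the variance and standard deviation, reading the prefactor as
 <code>N</code> divided by the product <code>W*(N-1)</code>, which reduces to the usual
 <code>1/(N-1)</code> for unit weights. The function names <code>weighted_variance</code> and
 <code>weighted_sdev</code> are hypothetical.</p>

```python
import math


def weighted_variance(x, w=None):
    """Variance = [N / (W * (N - 1))] * sum(w_i * (x_i - M)**2)."""
    n = len(x)
    if n < 2:
        raise ValueError("at least two values are needed")
    if w is None:
        w = [1.0] * n
    if len(w) != n:
        raise ValueError("w and x must have the same length")
    total_weight = sum(w)
    mean = sum(wi * xi for wi, xi in zip(w, x)) / total_weight
    squares = sum(wi * (xi - mean) ** 2 for wi, xi in zip(w, x))
    return n / (total_weight * (n - 1)) * squares


def weighted_sdev(x, w=None):
    """Standard deviation: square root of the variance."""
    return math.sqrt(weighted_variance(x, w))


print(weighted_variance([1, 2, 3, 4]))  # 5/3 for unit weights
print(weighted_sdev([1, 2, 3, 4]))
```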
<P>
 <font size="+1" color="green">Average deviation</font></p>
<P>
 The average deviation, or mean deviation, <code>&delta;</code>, is defined by</p>
<p>
 <center><code>&delta; = (1/W)*[w<sub>1</sub>|x<sub>1</sub>-M| + w<sub>2</sub>|x<sub>2</sub>-M| + ... +
 w<sub>N</sub>|x<sub>N</sub>-M|]</code></center></p>
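<p>
 The average deviation can be sketched in Python as follows. The function name
 <code>average_deviation</code> is hypothetical, and omitting <code>w</code> stands for unit
 weights.</p>

```python
def average_deviation(x, w=None):
    """delta = (1/W) * sum(w_i * |x_i - M|), where M is the weighted mean."""
    if w is None:
        w = [1.0] * len(x)
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    total_weight = sum(w)
    mean = sum(wi * xi for wi, xi in zip(w, x)) / total_weight
    return sum(wi * abs(xi - mean) for wi, xi in zip(w, x)) / total_weight


print(average_deviation([1, 2, 3, 4]))  # (1.5 + 0.5 + 0.5 + 1.5) / 4
```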
<P>
 <font size="+1" color="green">Skewness</font></p>
<P>
 The skewness, or third moment, <code>skew</code>, is a nondimensional quantity that
 characterizes the degree of asymmetry of a distribution around its mean. The
 skewness is a pure number that characterizes only the shape of the
 distribution, and is defined by</p>
<p>
 <center><code>skew = (1/W)*{w<sub>1</sub>[(x<sub>1</sub>-M)/&sigma;]<sup>3</sup> + 
 w<sub>2</sub>[(x<sub>2</sub>-M)/&sigma;]<sup>3</sup> + ... +
 w<sub>N</sub>[(x<sub>N</sub>-M)/&sigma;]<sup>3</sup>}</code></center></p>
<P>
 A positive value of skewness signifies a distribution with an asymmetric tail
 extending out towards more positive <i>x</i>; a negative value signifies a
 distribution whose tail extends out towards more negative <i>x</i>.</p>
<P>
 <font size="+1" color="green">Kurtosis</font></p>
<P>
 The kurtosis, <code>kurt</code>, is a nondimensional quantity which measures the
 relative peakedness or flatness of a distribution, relative to a normal
 distribution. A distribution with positive kurtosis is termed leptokurtic;
 a distribution with negative kurtosis is termed platykurtic. An in-between
 distribution is termed mesokurtic. The kurtosis is defined by</p>
<p>
 <center><code>kurt = (1/W)*{w<sub>1</sub>[(x<sub>1</sub>-M)/&sigma;]<sup>4</sup> +
 w<sub>2</sub>[(x<sub>2</sub>-M)/&sigma;]<sup>4</sup> + ... +
 w<sub>N</sub>[(x<sub>N</sub>-M)/&sigma;]<sup>4</sup>} - 3</code></center></P>
<P>
 where the <i>-3</i> term makes the value zero for a normal distribution.</p>
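<p>
 Skewness and kurtosis share the same structure: a weighted average of standardized deviations
 raised to the third or fourth power. The Python sketch below assumes that both sums are divided
 by the total weight <code>W</code>, as in the skewness definition, and that <code>&sigma;</code> is
 the standard deviation defined earlier with the prefactor <code>N/(W*(N-1))</code>. All function
 names are hypothetical.</p>

```python
import math


def standardized_moment(x, order, w=None):
    """(1/W) * sum(w_i * ((x_i - M) / sigma)**order)."""
    n = len(x)
    if n < 2:
        raise ValueError("at least two values are needed")
    if w is None:
        w = [1.0] * n
    if len(w) != n:
        raise ValueError("w and x must have the same length")
    total_weight = sum(w)
    mean = sum(wi * xi for wi, xi in zip(w, x)) / total_weight
    squares = sum(wi * (xi - mean) ** 2 for wi, xi in zip(w, x))
    sigma = math.sqrt(n / (total_weight * (n - 1)) * squares)
    if sigma == 0:
        raise ValueError("all values are identical, so sigma is zero")
    return sum(wi * ((xi - mean) / sigma) ** order for wi, xi in zip(w, x)) / total_weight


def skewness(x, w=None):
    return standardized_moment(x, 3, w)


def kurtosis(x, w=None):
    # Subtracting 3 makes the value zero for a normal distribution.
    return standardized_moment(x, 4, w) - 3.0


print(skewness([1, 2, 3, 4]))   # symmetric data: essentially zero
print(skewness([1, 1, 1, 10]))  # long tail towards larger x: positive
print(kurtosis([1, 2, 3, 4]))   # flat data: negative (platykurtic)
```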
<p>
 <font size="+2" color="green">Moments</font></p>
<TABLE border="1" cols="2" frame="box" rules="all" width="572">
<TR>
<TD width="15%" valign="top"><B>Syntax</B>:</TD>
<TD width="85%" valign="top"><CODE>
STATISTICS\MOMENTS w x n { s }</CODE>
</TD></TR></TABLE>
<p>
 If the <CODE>\MOMENTS</CODE> command qualifier is used, the <CODE>n</CODE><sup>th</sup>
 moment of vector <CODE>x</CODE>, with weight <CODE>w</CODE>, is calculated and optionally
 stored in output scalar <CODE>s</CODE>. The moment number, <CODE>n</CODE>, can be any integer
 <code>&gt; 0</code>.</p>
<P>
 <center><code>s = (1/W)*[w<sub>1</sub>x<sub>1</sub><sup>n</sup> +
 w<sub>2</sub>x<sub>2</sub><sup>n</sup> + ... +
 w<sub>N</sub>x<sub>N</sub><sup>n</sup>]</code></center></p>
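<p>
 The moment formula can be sketched in Python as follows. The function name
 <code>raw_moment</code> is hypothetical. Unlike the <CODE>\MOMENTS</CODE> syntax, which always takes
 a weight vector, the sketch lets <code>w</code> default to unit weights for convenience.</p>

```python
def raw_moment(x, n, w=None):
    """s = (1/W) * sum(w_i * x_i**n) for a positive integer n."""
    if not isinstance(n, int) or n < 1:
        raise ValueError("n must be an integer greater than 0")
    if w is None:
        w = [1.0] * len(x)
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    total_weight = sum(w)
    return sum(wi * xi ** n for wi, xi in zip(w, x)) / total_weight


print(raw_moment([1, 2, 3], 2))             # (1 + 4 + 9) / 3
print(raw_moment([1, 2, 3], 1, [1, 1, 2]))  # first moment equals the weighted mean
```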
<P>
<font size="+2" color="green">Linear correlation coefficient</font></p>
<TABLE border="1" cols="2" frame="box" rules="all" width="572">
<TR>
<TD width="15%" valign="top"><B>Syntax</B>:</TD>
<TD width="85%" valign="top"><CODE>
STATISTICS\PEARSON x y { r p }</CODE>
</TD></TR></TABLE>
<p>
 Pearson's <code>r</code>, or the linear correlation coefficient, is widely used as
 a measure of association between variables that are continuous.  For pairs
 of quantities <code>(x<sub>i</sub>,y<sub>i</sub>)</code>, for <code>i = 1,2,...,N</code>, the
 linear correlation coefficient <code>r</code> is given by the formula:</p>
<P>
 <IMG SRC="img33.gif"></P>
<P>
 where &nbsp;<IMG SRC="img12.gif">&nbsp; is the mean of <code>x</code>, and
 &nbsp;<IMG SRC="img35.gif">&nbsp; is the mean of <code>y</code>.</p>
<P>
 The value of <i>r</i> lies between <i>-1</i> and <i>+1</i>, inclusive. It
 takes on a value of <i>+1</i> when the data points lie on a straight line
 with positive slope, that is, when <code>x</code> and <code>y</code> increase together. The value
 <i>+1</i> holds regardless of the magnitude of this slope. If the data
 points lie on a straight line with negative slope, so that <code>y</code> decreases as
 <code>x</code> increases, then <code>r</code> has the value <i>-1</i>. A value of
 <code>r</code> near zero indicates that the variables <code>x</code> and <code>y</code> are
 uncorrelated.</p>
<P>
 <code>r</code> is a way of summarizing the strength of a correlation which is
 known to be significant, but it is a poor statistic for deciding whether an
 observed correlation is statistically significant, and/or whether one observed
 correlation is significantly stronger than another. The reason is that
 <code>r</code> is ignorant of the individual distributions of <code>x</code> and
 <code>y</code>, so there is no universal way to compute its distribution in the
 case of the null hypothesis.</p>
<P>
 The <CODE>STATISTICS\PEARSON</CODE> command returns Pearson's <code>r</code> in the scalar variable
 <CODE>r</CODE>. It also returns scalar <CODE>p</CODE>, the significance
 level at which the null hypothesis of zero correlation is disproved.
 A small value of <CODE>p</CODE> indicates a significant correlation.</p>
<P>
 <IMG SRC="img37.gif"></P>
<P>
 where <code>I</code> is the incomplete Beta function and <code>t</code> is defined by:</p>
<p> 
 <center><IMG SRC="img39.gif"></center></P>
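<p>
 For readers without access to the formula images, the Python sketch below computes <code>r</code>
 using the standard textbook definition: the sum of products of deviations from the means, divided
 by the square root of the product of the two sums of squared deviations. That this matches the
 formula shown in the image is an assumption, and the function name <code>pearson_r</code> is
 hypothetical. The significance <code>p</code> is not computed here because it needs an incomplete
 Beta function.</p>

```python
import math


def pearson_r(x, y):
    """Linear correlation coefficient of paired samples (standard definition)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two pairs are needed")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when x or y is constant")
    return sxy / math.sqrt(sxx * syy)


print(pearson_r([1, 2, 3], [2, 4, 6]))  # points on a line with positive slope
print(pearson_r([1, 2, 3], [3, 2, 1]))  # points on a line with negative slope
```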
<P>
 <font size="+1" color="green">Examples</font></p>
<p>
 Suppose you have a vector <code>X=[1.2;2.1;3.2;4.5;5;6;7]</code>. Entering
 <code><font color="blue">STATISTICS X</font></code> produces the following display:</P>
<p>
 <IMG SRC="ex1.png"></p>
<p>
 If you want to use the values for the maximum, minimum and mean of <TT>X</TT>,
 enter:</p>
<P>
 <code><font color="blue">STATISTICS X XMEAN\MEAN XMIN\MIN XMAX\MAX</font></code></p>
<P>
 and you will have the scalars: <code>XMAX=7</code>, <code>XMIN=1.2</code>, and
 <code>XMEAN=4.142857</code></p>
<P>
 If you also want the index values for the maximum and the minimum of
 <TT>X</TT>, enter:</p>
<P>
 <code><font color="blue">STATISTICS X XMEAN\MEAN XMIN\MIN XMAX\MAX IMX\IMAX IMN\IMIN</font></code></p>
<P>
 and you will also have scalars: <code>IMX=7</code> and <code>IMN=1</code>.</p>
</BODY>
</HTML>