stats.py module
(Requires pstat.py module.)
#################################################
####### Written by: Gary Strangman ###########
####### Last modified: Dec 28, 2000 ###########
#################################################
A collection of basic statistical functions for python. The function
names appear below.
IMPORTANT: There are really *3* sets of functions. The first set has an 'l'
prefix, which can be used with list or tuple arguments. The second set has
an 'a' prefix, which can accept NumPy array arguments. These latter
functions are defined only when NumPy is available on the system. The third
type has NO prefix (i.e., has the name that appears below). Functions of
this set are members of a "Dispatch" class, c/o David Ascher. This class
allows different functions to be called depending on the type of the passed
arguments. Thus, stats.mean is a member of the Dispatch class and
stats.mean(range(20)) will call stats.lmean(range(20)) while
stats.mean(Numeric.arange(20)) will call stats.amean(Numeric.arange(20)).
This is a handy way to keep consistent function names when different
argument types require different functions to be called. Having
implemented the Dispatch class, however, means that to get info on
a given function, you must use the REAL function name ... that is
"print stats.lmean.__doc__" or "print stats.amean.__doc__" work fine,
while "print stats.mean.__doc__" will print the doc for the Dispatch
class. NUMPY FUNCTIONS ('a' prefix) generally have more argument options
but should otherwise be consistent with the corresponding list functions.
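
The dispatch behaviour described above can be illustrated with a minimal
sketch. This is not the Dispatch class shipped with stats.py; it is a
hypothetical reimplementation written for Python 3, and the names and
signatures below are assumptions made for illustration. The standard
library's array.array stands in for a NumPy array so that the sketch runs
without NumPy installed.

```python
from array import array  # stand-in for a NumPy array type in this sketch


class Dispatch:
    """Call a different function depending on the type of the first argument."""

    def __init__(self, *table):
        # Each table entry is a pair: (function, tuple_of_types).
        self._by_type = {}
        for func, types in table:
            for t in types:
                if t in self._by_type:
                    raise ValueError("type %r registered twice" % (t,))
                self._by_type[t] = func

    def __call__(self, first, *args, **kwargs):
        # Look up the exact type of the first argument (no subclass matching).
        func = self._by_type.get(type(first))
        if func is None:
            raise TypeError(
                "no function registered for %s" % type(first).__name__
            )
        return func(first, *args, **kwargs)


def lmean(values):
    """Arithmetic mean of a list or tuple (hypothetical 'l' version)."""
    return sum(values) / float(len(values))


def amean(values):
    """Arithmetic mean of an array (hypothetical 'a' version)."""
    return sum(values) / float(len(values))


# The un-prefixed name is a Dispatch instance, not a plain function.
mean = Dispatch((lmean, (list, tuple)), (amean, (array,)))
```

In this sketch mean([1, 2, 3]) is routed to lmean and
mean(array("d", [1.0, 2.0, 3.0])) is routed to amean. Because mean is an
instance of Dispatch, mean.__doc__ returns the docstring of the Dispatch
class rather than that of lmean or amean, which is the behaviour noted
above. Under Python 3, range() no longer returns a list, so a call such as
mean(range(20)) would need to be written mean(list(range(20))) here.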
Disclaimers: The function list is obviously incomplete and, worse, the
functions are not optimized. All functions have been tested (some more
so than others), but they are far from bulletproof. Thus, as with any
free software, no warranty or guarantee is expressed or implied. :-) A
few extra functions that don't appear in the list below can be found by
interested treasure-hunters. These functions don't necessarily have
both list and array versions but were deemed useful.

CENTRAL TENDENCY:   geometricmean
                    harmonicmean
                    mean
                    median
                    medianscore
                    mode

MOMENTS:            moment
                    variation
                    skew
                    kurtosis
                    skewtest (for Numpy arrays only)
                    kurtosistest (for Numpy arrays only)
                    normaltest (for Numpy arrays only)

ALTERED VERSIONS:   tmean (for Numpy arrays only)
                    tvar (for Numpy arrays only)
                    tmin (for Numpy arrays only)
                    tmax (for Numpy arrays only)
                    tstdev (for Numpy arrays only)
                    tsem (for Numpy arrays only)
                    describe

FREQUENCY STATS:    itemfreq
                    scoreatpercentile
                    percentileofscore
                    histogram
                    cumfreq
                    relfreq

VARIABILITY:        obrientransform
                    samplevar
                    samplestdev
                    signaltonoise (for Numpy arrays only)
                    var
                    stdev
                    sterr
                    sem
                    z
                    zs
                    zmap (for Numpy arrays only)

TRIMMING FCNS:      threshold (for Numpy arrays only)
                    trimboth
                    trim1
                    round (round all vals to 'n' decimals; Numpy only)

CORRELATION FCNS:   covariance (for Numpy arrays only)
                    correlation (for Numpy arrays only)
                    paired
                    pearsonr
                    spearmanr
                    pointbiserialr
                    kendalltau
                    linregress

INFERENTIAL STATS:  ttest_1samp
                    ttest_ind
                    ttest_rel
                    chisquare
                    ks_2samp
                    mannwhitneyu
                    ranksums
                    wilcoxont
                    kruskalwallish
                    friedmanchisquare

PROBABILITY CALCS:  chisqprob
                    erfcc
                    zprob
                    ksprob
                    fprob
                    betacf
                    gammln
                    betai

ANOVA FUNCTIONS:    F_oneway
                    F_value

SUPPORT FUNCTIONS:  writecc
                    incr
                    sign (for Numpy arrays only)
                    sum
                    cumsum
                    ss
                    summult
                    sumdiffsquared
                    square_of_sums
                    shellsort
                    rankdata
                    outputpairedstats
                    findwithin