.. module:: statsmodels.stats
:synopsis: Statistical methods and tests
.. currentmodule:: statsmodels.stats
.. _stats:
Statistics :mod:`stats`
=======================
This section collects various statistical tests and tools.
Some can be used independently of any model; others are intended as extensions to the
models and model results.
API Warning: The functions and objects in this category are spread out over
various modules and might still be moved around. We expect that in the future the
statistical tests will return class instances with more informative reporting
instead of only the raw numbers.
.. _stattools:
Residual Diagnostics and Specification Tests
--------------------------------------------
.. module:: statsmodels.stats.stattools
:synopsis: Statistical methods and tests that do not fit into other categories
.. currentmodule:: statsmodels.stats.stattools
.. autosummary::
:toctree: generated/
durbin_watson
jarque_bera
omni_normtest
medcouple
robust_skewness
robust_kurtosis
expected_robust_kurtosis
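As an illustration of what a residual diagnostic computes, the Durbin-Watson
statistic is the sum of squared successive differences of the residuals divided
by their sum of squares. The following is a minimal plain-Python sketch of that
textbook formula, not the statsmodels implementation; the helper name
``durbin_watson_stat`` is hypothetical. Values near 2 indicate no first-order
autocorrelation, values toward 0 positive and values toward 4 negative
autocorrelation.

```python
def durbin_watson_stat(resids):
    """Textbook Durbin-Watson statistic for a sequence of residuals."""
    # Numerator: squared differences between neighbouring residuals.
    num = sum((b - a) ** 2 for a, b in zip(resids, resids[1:]))
    # Denominator: residual sum of squares.
    den = sum(e ** 2 for e in resids)
    return num / den


alternating = [1.0, -1.0, 1.0, -1.0]  # sign flips every step
persistent = [1.0, 1.0, -1.0, -1.0]   # neighbours tend to agree

dw_alternating = durbin_watson_stat(alternating)  # 3.0, above 2
dw_persistent = durbin_watson_stat(persistent)    # 1.0, below 2
```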
.. module:: statsmodels.stats.diagnostic
:synopsis: Statistical methods and tests to diagnose model fit problems
.. currentmodule:: statsmodels.stats.diagnostic
.. autosummary::
:toctree: generated/
acorr_breusch_godfrey
acorr_ljungbox
acorr_lm
breaks_cusumolsresid
breaks_hansen
recursive_olsresiduals
compare_cox
compare_encompassing
compare_j
het_arch
het_breuschpagan
het_goldfeldquandt
het_white
spec_white
linear_harvey_collier
linear_lm
linear_rainbow
linear_reset
Outliers and influence measures
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. module:: statsmodels.stats.outliers_influence
:synopsis: Statistical methods and measures for outliers and influence
.. currentmodule:: statsmodels.stats.outliers_influence
.. autosummary::
:toctree: generated/
OLSInfluence
GLMInfluence
MLEInfluence
variance_inflation_factor
See also the :ref:`notes on regression diagnostics <diagnostics>`.
Sandwich Robust Covariances
---------------------------
The following functions calculate covariance matrices and standard errors for
the parameter estimates that are robust to heteroscedasticity and
autocorrelation in the errors. Similar to the methods that are available
for the LinearModelResults, these methods are designed for use with OLS.
.. currentmodule:: statsmodels.stats
.. autosummary::
:toctree: generated/
sandwich_covariance.cov_hac
sandwich_covariance.cov_nw_panel
sandwich_covariance.cov_nw_groupsum
sandwich_covariance.cov_cluster
sandwich_covariance.cov_cluster_2groups
sandwich_covariance.cov_white_simple
The following are standalone versions of the heteroscedasticity robust
standard errors attached to LinearModelResults.
.. autosummary::
:toctree: generated/
sandwich_covariance.cov_hc0
sandwich_covariance.cov_hc1
sandwich_covariance.cov_hc2
sandwich_covariance.cov_hc3
sandwich_covariance.se_cov
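To make the idea of a heteroscedasticity robust standard error concrete, the
following plain-Python sketch compares the classical and the HC0 ("White")
standard error of the slope in a simple regression with intercept. It is an
illustration under simplifying assumptions (one regressor, no small-sample
correction) and not the code behind ``cov_hc0``; the function name
``slope_se_classical_and_hc0`` and the data are made up for the example.

```python
import math


def slope_se_classical_and_hc0(x, y):
    """Return (slope, classical se, HC0 se) for y = a + b * x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classical: one common error variance for all observations.
    sigma2 = sum(e ** 2 for e in resid) / (n - 2)
    se_classical = math.sqrt(sigma2 / sxx)

    # HC0 sandwich: each squared residual stands in for its own variance.
    meat = sum((xi - x_bar) ** 2 * e ** 2 for xi, e in zip(x, resid))
    se_hc0 = math.sqrt(meat / sxx ** 2)
    return slope, se_classical, se_hc0


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.8, 5.5]
slope, se_classical, se_hc0 = slope_se_classical_and_hc0(x, y)
```

The two standard errors differ whenever the squared residuals are related to
the distance of ``x`` from its mean.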
Goodness of Fit Tests and Measures
----------------------------------
Some tests for goodness of fit for univariate distributions.
.. module:: statsmodels.stats.gof
:synopsis: Goodness of fit measures and tests
.. currentmodule:: statsmodels.stats.gof
.. autosummary::
:toctree: generated/
powerdiscrepancy
gof_chisquare_discrete
gof_binning_discrete
chisquare_effectsize
.. currentmodule:: statsmodels.stats.diagnostic
.. autosummary::
:toctree: generated/
anderson_statistic
normal_ad
kstest_exponential
kstest_fit
kstest_normal
lilliefors
Non-Parametric Tests
--------------------
.. module:: statsmodels.sandbox.stats.runs
:synopsis: Experimental statistical methods and tests to analyze runs
.. currentmodule:: statsmodels.sandbox.stats.runs
.. autosummary::
:toctree: generated/
mcnemar
symmetry_bowker
median_test_ksample
runstest_1samp
runstest_2samp
cochrans_q
Runs
.. currentmodule:: statsmodels.stats.descriptivestats
.. autosummary::
:toctree: generated/
sign_test
.. currentmodule:: statsmodels.stats.nonparametric
.. autosummary::
:toctree: generated/
rank_compare_2indep
rank_compare_2ordinal
RankCompareResult
cohensd2problarger
prob_larger_continuous
rankdata_2samp
Descriptive Statistics
----------------------
.. module:: statsmodels.stats.descriptivestats
:synopsis: Descriptive statistics
.. currentmodule:: statsmodels.stats.descriptivestats
.. autosummary::
:toctree: generated/
describe
Description
.. _interrater:
Interrater Reliability and Agreement
------------------------------------
The main function that statsmodels currently provides for interrater
agreement measures and tests is Cohen's Kappa. Fleiss' Kappa is currently
implemented only as a measure, without associated results statistics.
.. module:: statsmodels.stats.inter_rater
.. currentmodule:: statsmodels.stats.inter_rater
.. autosummary::
:toctree: generated/
cohens_kappa
fleiss_kappa
to_table
aggregate_raters
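For orientation, unweighted Cohen's Kappa for two raters compares the observed
agreement with the agreement expected from the marginal frequencies alone. The
sketch below computes it from a square contingency table in plain Python. It
does not call ``cohens_kappa`` and returns only the point estimate; the helper
name ``kappa_from_table`` and the counts are hypothetical.

```python
def kappa_from_table(table):
    """Unweighted Cohen's kappa from a square table of counts.

    table[i][j] is the number of items that rater 1 put in category i
    and rater 2 put in category j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    p_observed = sum(table[i][i] for i in range(k)) / n
    row_share = [sum(table[i]) / n for i in range(k)]
    col_share = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected if the two raters were independent.
    p_expected = sum(r * c for r, c in zip(row_share, col_share))
    return (p_observed - p_expected) / (1 - p_expected)


table = [[20, 5],
         [10, 15]]
kappa = kappa_from_table(table)  # observed 0.7, expected 0.5, kappa 0.4
```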
Multiple Tests and Multiple Comparison Procedures
-------------------------------------------------
`multipletests` is a function for p-value correction; it also covers the
FDR-based corrections that are available separately in `fdrcorrection`.
`tukeyhsd` performs simultaneous testing for the comparison of (independent) means.
These three functions are verified.
GroupsStats and MultiComparison are convenience classes for multiple comparisons similar
to one-way ANOVA, but they are still in development.
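To show what an FDR-based p-value correction does, here is a plain-Python
sketch of the Benjamini-Hochberg step-up adjustment. It is meant as an
illustration of the technique and is not the implementation in
`multipletests` or `fdrcorrection`; the function name ``fdr_bh_sketch`` is
hypothetical.

```python
def fdr_bh_sketch(pvals, alpha=0.05):
    """Benjamini-Hochberg adjusted p-values and rejection decisions."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the
    # adjusted values are monotone in the original p-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    reject = [p <= alpha for p in adjusted]
    return reject, adjusted


pvals = [0.01, 0.04, 0.03, 0.20]
reject, adjusted = fdr_bh_sketch(pvals, alpha=0.05)
# Only the smallest p-value is rejected at alpha = 0.05.
```

Rejecting the hypotheses whose adjusted p-value is at most ``alpha`` is
equivalent to the usual step-up rule on the sorted raw p-values.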
.. module:: statsmodels.sandbox.stats.multicomp
:synopsis: Experimental methods for controlling size while performing multiple comparisons
.. currentmodule:: statsmodels.stats.multitest
.. autosummary::
:toctree: generated/
multipletests
fdrcorrection
.. currentmodule:: statsmodels.sandbox.stats.multicomp
.. autosummary::
:toctree: generated/
GroupsStats
MultiComparison
TukeyHSDResults
.. module:: statsmodels.stats.multicomp
:synopsis: Methods for controlling size while performing multiple comparisons
.. currentmodule:: statsmodels.stats.multicomp
.. autosummary::
:toctree: generated/
pairwise_tukeyhsd
.. module:: statsmodels.stats.multitest
:synopsis: Multiple testing p-value and FDR adjustments
.. currentmodule:: statsmodels.stats.multitest
.. autosummary::
:toctree: generated/
local_fdr
fdrcorrection_twostage
NullDistribution
RegressionFDR
.. module:: statsmodels.stats.knockoff_regeffects
:synopsis: Regression Knock-Off Effects
.. currentmodule:: statsmodels.stats.knockoff_regeffects
.. autosummary::
:toctree: generated/
CorrelationEffects
OLSEffects
ForwardEffects
RegModelEffects
The following functions are not (yet) public
.. currentmodule:: statsmodels.sandbox.stats.multicomp
.. autosummary::
:toctree: generated/
varcorrection_pairs_unbalanced
varcorrection_pairs_unequal
varcorrection_unbalanced
varcorrection_unequal
StepDown
catstack
ccols
compare_ordered
distance_st_range
ecdf
get_tukeyQcrit
homogeneous_subsets
maxzero
maxzerodown
mcfdr
qcrit
randmvn
rankdata
rejectionline
set_partition
set_remove_subs
tiecorrect
.. _tost:
Basic Statistics and t-Tests with frequency weights
---------------------------------------------------
Besides basic statistics, like mean, variance, covariance and correlation for
data with case weights, the classes here provide one and two sample tests
for means. The t-tests have more options than those in scipy.stats, but are
more restrictive in the shape of the arrays. Confidence intervals for means
are provided based on the same assumptions as the t-tests.
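With frequency (case) weights, weighted statistics agree with the unweighted
statistics of the expanded data in which each observation is repeated as often
as its weight says. The plain-Python sketch below checks this for the mean and
the variance. It does not use ``DescrStatsW``; the helper name
``weighted_mean_var`` is hypothetical.

```python
import statistics


def weighted_mean_var(values, freq_weights, ddof=1):
    """Mean and variance of data given with integer frequency weights."""
    nobs = sum(freq_weights)  # total number of cases
    mean = sum(w * x for x, w in zip(values, freq_weights)) / nobs
    sum_sq = sum(w * (x - mean) ** 2 for x, w in zip(values, freq_weights))
    return mean, sum_sq / (nobs - ddof)


values = [1.0, 2.0, 4.0]
weights = [2, 1, 1]
mean_w, var_w = weighted_mean_var(values, weights)

# The same data with the first observation written out twice.
expanded = [1.0, 1.0, 2.0, 4.0]
mean_e = statistics.mean(expanded)
var_e = statistics.variance(expanded)
```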
Additionally, tests for equivalence of means are available for one sample and
for two, either paired or independent, samples. These tests are based on TOST
(two one-sided tests), whose null hypothesis is that the means are not
"close" to each other.
.. module:: statsmodels.stats.weightstats
:synopsis: Weighted statistics
.. currentmodule:: statsmodels.stats.weightstats
.. autosummary::
:toctree: generated/
DescrStatsW
CompareMeans
ttest_ind
ttost_ind
ttost_paired
ztest
ztost
zconfint
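The logic of the TOST equivalence tests described above can be sketched with a
normal approximation: equivalence is concluded only if both one-sided tests
reject, so the overall p-value is the larger of the two. This is a simplified
plain-Python illustration for a single mean with known standard error, not the
code of ``ztost`` or ``ttost_ind``; the name ``tost_z_sketch`` and the numbers
are hypothetical.

```python
from statistics import NormalDist


def tost_z_sketch(mean, std_err, low, upp):
    """Two one-sided z-tests for low < mu < upp.

    Returns (pvalue, p_low, p_upp) where pvalue = max(p_low, p_upp).
    """
    norm = NormalDist()
    # H0: mu <= low against H1: mu > low
    p_low = 1 - norm.cdf((mean - low) / std_err)
    # H0: mu >= upp against H1: mu < upp
    p_upp = norm.cdf((mean - upp) / std_err)
    return max(p_low, p_upp), p_low, p_upp


# Estimate well inside the equivalence margins (-0.5, 0.5).
p_inside, _, _ = tost_z_sketch(mean=0.1, std_err=0.2, low=-0.5, upp=0.5)
# Estimate close to the upper margin: equivalence is not supported.
p_edge, _, _ = tost_z_sketch(mean=0.45, std_err=0.2, low=-0.5, upp=0.5)
```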
weightstats also contains tests and confidence intervals based on summary
data.
.. currentmodule:: statsmodels.stats.weightstats
.. autosummary::
:toctree: generated/
_tconfint_generic
_tstat_generic
_zconfint_generic
_zstat_generic
_zstat_generic2
Power and Sample Size Calculations
----------------------------------
The :mod:`power` module currently implements power and sample size calculations
for t-tests, normal-based tests, F-tests and the chi-square goodness-of-fit test.
The implementation is class based, but the module also provides
three shortcut functions, ``tt_solve_power``, ``tt_ind_solve_power`` and
``zt_ind_solve_power`` to solve for any one of the parameters of the power
equations.
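What "solving the power equation for one parameter" means can be sketched for
the normal-based two-sample test. The plain-Python code below computes
two-sided power from effect size, sample size and significance level, and then
searches for the smallest first-sample size that reaches a target power. It is
a simplified illustration and not the implementation of ``NormalIndPower`` or
``zt_ind_solve_power``; the names ``power_z_two_sample`` and ``solve_nobs1``
are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_z_two_sample(effect_size, nobs1, alpha=0.05, ratio=1.0):
    """Two-sided power of a z-test for two independent samples.

    effect_size is the standardized mean difference and
    ratio = nobs2 / nobs1.
    """
    crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Noncentrality: effect size scaled by the effective sample size.
    ncp = effect_size * math.sqrt(nobs1 * ratio / (1 + ratio))
    return _STD_NORMAL.cdf(ncp - crit) + _STD_NORMAL.cdf(-ncp - crit)


def solve_nobs1(effect_size, power=0.8, alpha=0.05, ratio=1.0):
    """Smallest integer nobs1 that reaches the requested power."""
    if effect_size == 0:
        raise ValueError("power never exceeds alpha when effect_size is 0")
    nobs1 = 2
    # Power increases with nobs1, so a simple upward search is enough here.
    while power_z_two_sample(effect_size, nobs1, alpha, ratio) < power:
        nobs1 += 1
    return nobs1


nobs1 = solve_nobs1(effect_size=0.5, power=0.8, alpha=0.05)
achieved = power_z_two_sample(0.5, nobs1)
```

The same power function could be solved for the effect size or for ``alpha``
with a root finder instead of the integer search used here.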
.. module:: statsmodels.stats.power
:synopsis: Power and size calculations for common tests
.. currentmodule:: statsmodels.stats.power
.. autosummary::
:toctree: generated/
TTestIndPower
TTestPower
GofChisquarePower
NormalIndPower
FTestAnovaPower
FTestPower
normal_power_het
normal_sample_size_one_tail
tt_solve_power
tt_ind_solve_power
zt_ind_solve_power
.. _proportion_stats:
Proportion
----------
Also available are hypothesis tests, confidence intervals and effect sizes for
proportions that can be used with NormalIndPower.
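Two of the quantities mentioned here have short closed forms. The plain-Python
sketch below computes a Wilson score confidence interval for one proportion and
the arcsine-based effect size (Cohen's h) for two proportions. It does not call
``proportion_confint`` or ``proportion_effectsize``; whether those functions
use exactly these formulas by default should be checked in their API
documentation, and the helper names are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_confint(count, nobs, alpha=0.05):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = count / nobs
    denom = 1 + z ** 2 / nobs
    center = (p_hat + z ** 2 / (2 * nobs)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / nobs + z ** 2 / (4 * nobs ** 2)
    ) / denom
    return center - half, center + half


def cohens_h(prop1, prop2):
    """Effect size for two proportions on the arcsine scale."""
    return 2 * (math.asin(math.sqrt(prop1)) - math.asin(math.sqrt(prop2)))


low, upp = wilson_confint(count=50, nobs=100)
h = cohens_h(0.6, 0.5)
```

An effect size of this kind can then serve as the standardized effect in a
normal-based power calculation.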
.. module:: statsmodels.stats.proportion
:synopsis: Tests for proportions
.. currentmodule:: statsmodels.stats.proportion
.. autosummary::
:toctree: generated
proportion_confint
proportion_effectsize
binom_test
binom_test_reject_interval
binom_tost
binom_tost_reject_interval
multinomial_proportions_confint
proportions_ztest
proportions_ztost
proportions_chisquare
proportions_chisquare_allpairs
proportions_chisquare_pairscontrol
power_binom_tost
power_ztost_prop
samplesize_confint_proportion
Statistics for two independent samples
Status: experimental, API might change, added in 0.12
.. autosummary::
:toctree: generated
test_proportions_2indep
confint_proportions_2indep
power_proportions_2indep
tost_proportions_2indep
samplesize_proportions_2indep_onetail
score_test_proportions_2indep
_score_confint_inversion
Rates
-----
Statistical functions for rates. This currently includes hypothesis tests for
two independent samples.
See also the example notebook for an overview:
`Poisson Rates <examples/notebooks/generated/stats_poisson.ipynb>`_
Status: experimental, API might change, added in 0.12, refactored and enhanced
in 0.14
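As an example of a hypothesis test for two independent Poisson samples, the
sketch below implements one simple large-sample variant: a z-test for equal
rates that estimates the variance from the pooled rate. It is written in plain
Python for illustration only. It is not the code of ``test_poisson_2indep``,
which offers several methods, and the function name ``poisson_2indep_z`` and
the counts are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_2indep_z(count1, exposure1, count2, exposure2):
    """z-test of H0: rate1 == rate2 using the pooled rate under H0."""
    rate1 = count1 / exposure1
    rate2 = count2 / exposure2
    pooled = (count1 + count2) / (exposure1 + exposure2)
    # Var(count / exposure) = rate / exposure for a Poisson count.
    var_diff = pooled * (1 / exposure1 + 1 / exposure2)
    z = (rate1 - rate2) / math.sqrt(var_diff)
    pvalue = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, pvalue


# 60 events in 100 units of exposure against 30 events in 100 units.
z_stat, pvalue = poisson_2indep_z(60, 100.0, 30, 100.0)
```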
.. module:: statsmodels.stats.rates
:synopsis: Tests for Poisson rates
.. currentmodule:: statsmodels.stats.rates
Statistical functions for one sample
.. autosummary::
:toctree: generated
test_poisson
confint_poisson
confint_quantile_poisson
tolerance_int_poisson
Statistical functions for two independent samples
.. autosummary::
:toctree: generated
test_poisson_2indep
etest_poisson_2indep
confint_poisson_2indep
tost_poisson_2indep
nonequivalence_poisson_2indep
Functions for statistical power
.. autosummary::
:toctree: generated
power_poisson_ratio_2indep
power_equivalence_poisson_2indep
power_poisson_diff_2indep
power_negbin_ratio_2indep
power_equivalence_neginb_2indep
Multivariate
------------
Statistical functions for multivariate samples.
This includes hypothesis tests and confidence intervals for the mean of a sample
of multivariate observations, and hypothesis tests for the structure of a
covariance matrix.
Status: experimental, API might change, added in 0.12
.. module:: statsmodels.stats.multivariate
:synopsis: Statistical functions for multivariate samples.
.. currentmodule:: statsmodels.stats.multivariate
.. autosummary::
:toctree: generated
test_mvmean
confint_mvmean
confint_mvmean_fromstats
test_mvmean_2indep
test_cov
test_cov_blockdiagonal
test_cov_diagonal
test_cov_oneway
test_cov_spherical
.. _oneway_stats:
Oneway Anova
------------
Hypothesis tests, confidence intervals and effect sizes for oneway analysis of
k samples.
Status: experimental, API might change, added in 0.12
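For reference, the classical one-way ANOVA F statistic and two common effect
sizes can be computed directly from the group data. The plain-Python sketch
below assumes equal variances across groups and returns no p-value. It is not
the implementation of ``anova_oneway``, and the helper name
``anova_oneway_sketch`` is hypothetical.

```python
def anova_oneway_sketch(samples):
    """Classical F statistic, degrees of freedom, eta squared and f squared."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]

    ss_between = sum(
        len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means)
    )
    ss_within = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta2 = ss_between / (ss_between + ss_within)  # share of variance explained
    f2 = eta2 / (1 - eta2)                        # Cohen's f squared
    return f_stat, (df_between, df_within), eta2, f2


groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df, eta2, f2 = anova_oneway_sketch(groups)
# Between sum of squares 26, within sum of squares 6, so F = 13 on (2, 6) df.
```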
.. module:: statsmodels.stats.oneway
:synopsis: Statistical functions for oneway analysis, Anova.
.. currentmodule:: statsmodels.stats.oneway
.. autosummary::
:toctree: generated
anova_oneway
anova_generic
equivalence_oneway
equivalence_oneway_generic
power_equivalence_oneway
_power_equivalence_oneway_emp
test_scale_oneway
equivalence_scale_oneway
confint_effectsize_oneway
confint_noncentrality
convert_effectsize_fsqu
effectsize_oneway
f2_to_wellek
fstat_to_wellek
wellek_to_f2
_fstat2effectsize
scale_transform
simulate_power_equivalence_oneway
.. _robust_stats:
Robust, Trimmed Statistics
--------------------------
Statistics for samples that are trimmed at a fixed fraction. This includes
the class TrimmedMean for one-sample statistics. It is used in `stats.oneway`
for trimmed "Yuen" Anova.
Status: experimental, API might change, added in 0.12
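Trimming at a fixed fraction means dropping the same share of observations
from each tail before computing the statistic. A minimal plain-Python sketch of
a trimmed mean follows. It is not the ``TrimmedMean`` class, the helper name
``trimmed_mean_sketch`` is hypothetical, and the rule of truncating
``fraction * nobs`` to an integer is an assumption of this example.

```python
def trimmed_mean_sketch(data, fraction):
    """Mean after removing int(fraction * nobs) points from each tail."""
    ordered = sorted(data)
    nobs = len(ordered)
    cut = int(fraction * nobs)  # number of points dropped per tail
    if 2 * cut >= nobs:
        raise ValueError("fraction is too large, no observations are left")
    kept = ordered[cut:nobs - cut]
    return sum(kept) / len(kept)


data = [1.0, 2.0, 3.0, 4.0, 100.0]
plain_mean = sum(data) / len(data)          # pulled up by the outlier
trimmed = trimmed_mean_sketch(data, 0.2)    # drops 1.0 and 100.0
```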
.. module:: statsmodels.stats.robust_compare
:synopsis: Trimmed sample statistics.
.. currentmodule:: statsmodels.stats.robust_compare
.. autosummary::
:toctree: generated
TrimmedMean
scale_transform
trim_mean
trimboth
Moment Helpers
--------------
When there are missing values, a correlation or covariance matrix may fail to be
positive semi-definite, for example when its entries are computed pairwise from
different subsets of observations. The following
functions can be used to find a correlation or covariance matrix that is
positive definite and close to the original matrix.
Additional functions estimate a spatial covariance matrix and a regularized
inverse covariance (precision) matrix.
.. module:: statsmodels.stats.correlation_tools
:synopsis: Procedures for ensuring correlations are positive semi-definite
.. currentmodule:: statsmodels.stats.correlation_tools
.. autosummary::
:toctree: generated/
corr_clipped
corr_nearest
corr_nearest_factor
corr_thresholded
cov_nearest
cov_nearest_factor_homog
FactoredPSDMatrix
kernel_covariance
.. currentmodule:: statsmodels.stats.regularized_covariance
.. autosummary::
:toctree: generated/
RegularizedInvCovariance
These are utility functions to convert between central and non-central moments, skew,
kurtosis and cumulants.
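The conversions follow from expanding the powers of ``x - mean``. As a sketch,
the plain-Python code below converts the first four non-central moments to
central moments and then to mean, variance, skew and excess kurtosis. The
function names and the convention of keeping the mean in the first position are
choices of this example and are not taken from ``mnc2mc`` or ``mc2mvsk``.

```python
def mnc_to_mc(mnc):
    """First four non-central moments to (mean, mc2, mc3, mc4)."""
    m1, m2, m3, m4 = mnc
    mc2 = m2 - m1 ** 2
    mc3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mc4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return m1, mc2, mc3, mc4


def mc_to_mvsk(mc):
    """(mean, mc2, mc3, mc4) to mean, variance, skew, excess kurtosis."""
    mean, mc2, mc3, mc4 = mc
    skew = mc3 / mc2 ** 1.5
    excess_kurtosis = mc4 / mc2 ** 2 - 3.0
    return mean, mc2, skew, excess_kurtosis


# Non-central moments of a normal distribution with mean 1 and variance 1.
mnc_normal = (1.0, 2.0, 4.0, 10.0)
mvsk = mc_to_mvsk(mnc_to_mc(mnc_normal))
# Skew and excess kurtosis of a normal distribution are both zero.
```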
.. module:: statsmodels.stats.moment_helpers
:synopsis: Tools for converting moments
.. currentmodule:: statsmodels.stats.moment_helpers
.. autosummary::
:toctree: generated/
cum2mc
mc2mnc
mc2mvsk
mnc2cum
mnc2mc
mnc2mvsk
mvsk2mc
mvsk2mnc
cov2corr
corr2cov
se_cov
Mediation Analysis
------------------
Mediation analysis focuses on the relationships among three key variables:
an 'outcome', a 'treatment', and a 'mediator'. Since mediation analysis is a
form of causal inference, there are several assumptions involved that are
difficult or impossible to verify. Ideally, mediation analysis is conducted in
the context of an experiment in which the treatment is
randomly assigned. It is also common for people to conduct mediation analyses
using observational data in which the treatment may be thought of as an
'exposure'. The assumptions behind mediation analysis are even more difficult
to verify in an observational setting.
.. module:: statsmodels.stats.mediation
:synopsis: Mediation analysis
.. currentmodule:: statsmodels.stats.mediation
.. autosummary::
:toctree: generated/
Mediation
MediationResults
Oaxaca-Blinder Decomposition
----------------------------
The Oaxaca-Blinder (or Blinder-Oaxaca) decomposition attempts to explain
gaps in the means of groups. It uses the linear models of two given regression equations to
show what is explained by regression coefficients and known data and what is unexplained
using the same data. There are two types of Oaxaca-Blinder decompositions, the two-fold
and the three-fold, both of which are used in the economics literature to discuss
differences between groups. This method helps classify discrimination or unobserved effects.
This module attempts to port the functionality of the oaxaca command in Stata to Python.
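The three-fold decomposition can be written out for the simplest case of one
regressor plus an intercept. The plain-Python sketch below fits a separate line
to each group and splits the gap in mean outcomes into an endowments part, a
coefficients part and an interaction part, taking group B as the reference. It
is an illustration of the identity only and not the ``OaxacaBlinder`` class;
the function names, the choice of reference group and the data are assumptions
of this example.

```python
def fit_line(x, y):
    """OLS intercept and slope plus the sample means of x and y."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope, x_bar, y_bar


def threefold_sketch(x_a, y_a, x_b, y_b):
    """Return (gap, endowments, coefficients, interaction)."""
    a0, a1, xa_bar, ya_bar = fit_line(x_a, y_a)
    b0, b1, xb_bar, yb_bar = fit_line(x_b, y_b)
    gap = ya_bar - yb_bar
    # Difference in average characteristics, valued at B's slope.
    endowments = (xa_bar - xb_bar) * b1
    # Difference in intercept and slope, evaluated at B's characteristics.
    coefficients = (a0 - b0) + xb_bar * (a1 - b1)
    # Both differences acting together.
    interaction = (xa_bar - xb_bar) * (a1 - b1)
    return gap, endowments, coefficients, interaction


x_a, y_a = [10.0, 12.0, 14.0, 16.0], [20.0, 25.0, 27.0, 32.0]
x_b, y_b = [8.0, 10.0, 12.0, 14.0], [15.0, 17.0, 20.0, 22.0]
gap, endow, coef, inter = threefold_sketch(x_a, y_a, x_b, y_b)
```

Because each fitted line passes through the point of group means, the three
parts add up to the gap exactly, apart from floating point rounding.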
.. module:: statsmodels.stats.oaxaca
:synopsis: Oaxaca-Blinder Decomposition
.. currentmodule:: statsmodels.stats.oaxaca
.. autosummary::
:toctree: generated/
OaxacaBlinder
OaxacaResults
Distance Dependence Measures
----------------------------
Distance dependence measures and the Distance Covariance (dCov) test.
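Distance correlation is built from pairwise distances that are double centered
within each sample. The plain-Python sketch below computes the biased
(V-statistic) sample distance correlation for two univariate samples. It is an
illustration of the definition, not the code of ``distance_correlation``, and
the function names are hypothetical. The example uses a quadratic relationship
for which the Pearson correlation is zero while the distance correlation is
not.

```python
import math


def _centered_distances(v):
    """Pairwise absolute distances with row, column and grand mean removed."""
    n = len(v)
    dist = [[abs(a - b) for b in v] for a in v]
    row_mean = [sum(row) / n for row in dist]  # equals column means (symmetry)
    grand_mean = sum(row_mean) / n
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation_1d(x, y):
    """Biased sample distance correlation of two equally long samples."""
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / n ** 2

    dcov2 = max(mean_product(a, b), 0.0)  # guard against rounding below zero
    dvar_x = mean_product(a, a)
    dvar_y = mean_product(b, b)
    if dvar_x == 0 or dvar_y == 0:
        raise ValueError("distance correlation is undefined for constant data")
    return math.sqrt(dcov2 / math.sqrt(dvar_x * dvar_y))


x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [xi ** 2 for xi in x]  # depends on x, but is uncorrelated with it

pearson_cov = sum(xi * yi for xi, yi in zip(x, y))  # means of x: 0, so this is 0
dcor_xy = distance_correlation_1d(x, y)
dcor_xx = distance_correlation_1d(x, x)
```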
.. module:: statsmodels.stats.dist_dependence_measures
:synopsis: Distance Dependence Measures
.. currentmodule:: statsmodels.stats.dist_dependence_measures
.. autosummary::
:toctree: generated/
distance_covariance_test
distance_statistics
distance_correlation
distance_covariance
distance_variance
Meta-Analysis
-------------
Functions for basic meta-analysis of a collection of sample statistics.
Examples can be found in the notebook
* `Meta-Analysis <examples/notebooks/generated/metaanalysis1.ipynb>`_
Status: experimental, API might change, added in 0.12
.. module:: statsmodels.stats.meta_analysis
:synopsis: Meta-Analysis
.. currentmodule:: statsmodels.stats.meta_analysis
.. autosummary::
:toctree: generated/
combine_effects
effectsize_2proportions
effectsize_smd
CombineResults
The module also includes internal functions to compute random effects
variance.
.. autosummary::
:toctree: generated/
_fit_tau_iter_mm
_fit_tau_iterative
_fit_tau_mm
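To illustrate how effects are combined and how a random effects variance
enters, the plain-Python sketch below computes the inverse-variance weighted
fixed effect estimate and a one-step method of moments (DerSimonian-Laird)
estimate of the between-study variance. It is a simplified example and not the
implementation of ``combine_effects`` or of the internal functions listed
above; the function name ``combine_effects_sketch`` and the numbers are
hypothetical.

```python
def combine_effects_sketch(effects, variances):
    """Return (fixed effect mean, random effects mean, tau2, Q)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    w_sum = sum(w)
    mean_fixed = sum(wi * e for wi, e in zip(w, effects)) / w_sum

    # Cochran's Q: weighted squared deviations from the fixed effect mean.
    q = sum(wi * (e - mean_fixed) ** 2 for wi, e in zip(w, effects))
    scale = w_sum - sum(wi ** 2 for wi in w) / w_sum
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / scale)

    # Random effects weights add tau2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    mean_random = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    return mean_fixed, mean_random, tau2, q


effects = [0.1, 0.3, 0.5]
variances = [0.01, 0.01, 0.01]
mean_fixed, mean_random, tau2, q_stat = combine_effects_sketch(effects, variances)
# Q = 8 with 2 degrees of freedom gives tau2 = (8 - 2) / 200 = 0.03.
```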