File: info.py

"""
Routines for Gaussian Mixture Models and learning with Expectation Maximization
================================================================================

This module contains classes and functions to compute multivariate Gaussian
densities (with diagonal or full covariance matrices), Gaussian mixtures,
Gaussian mixture models and an EM trainer.

More specifically, the module defines the following classes and functions:

- densities.gauss_den: function to compute a multivariate Gaussian pdf (a
generic sketch of this computation appears after this list)
- gauss_mix.GM: defines the GM (Gaussian Mixture) class. A Gaussian mixture
can be created from its parameters (weights, means and variances) or from its
meta parameters d (dimension of the Gaussians) and k (number of components in
the mixture). A Gaussian mixture can then be sampled or plotted (if d > 1,
plot confidence ellipsoids projected onto 2 chosen dimensions; if d == 1,
plot the pdf of each component and fill the confidence zone for a given
level)
- gmm_em.GMM: defines the GMM (Gaussian Mixture Model) class. A GMM is
constructed from a GM instance gm and can be used to train gm. The GMM can be
initialized by k-means or at random, can compute sufficient statistics, and
can update its parameters from those sufficient statistics.
- kmean.kmean: implements a k-means algorithm (a bare-bones sketch follows
this list). We cannot use scipy.cluster.vq's kmeans, since it does not give
the membership of the observations.
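
As an illustration of what densities.gauss_den computes, here is a minimal
plain-NumPy sketch of the multivariate Gaussian pdf for the diagonal
covariance case (a generic reimplementation with assumed names, not the
toolbox's own code):

    import numpy as np

    def diag_gauss_pdf(x, mu, va):
        # x: (n, d) points, mu: (d,) mean, va: (d,) per-dimension variances
        d = mu.shape[0]
        norm = (2 * np.pi) ** (-d / 2.0) * np.prod(va) ** (-0.5)
        # squared Mahalanobis distance, which reduces to a weighted
        # Euclidean distance when the covariance matrix is diagonal
        maha = np.sum((x - mu) ** 2 / va, axis=1)
        return norm * np.exp(-0.5 * maha)

Likewise, the membership point made for kmean.kmean can be sketched as
follows (assumed names and a bare-bones update loop, not the toolbox's code):
each iteration assigns every observation to its nearest centroid, and those
labels are returned along with the centroids.

    import numpy as np

    def kmean_sketch(data, code, niter = 10):
        # data: (n, d) observations, code: (k, d) initial centroids
        for _ in range(niter):
            # membership: index of the nearest centroid for each point
            dist = ((data[:, None, :] - code[None, :, :]) ** 2).sum(-1)
            label = dist.argmin(axis = 1)
            # update each centroid as the mean of its assigned points
            for j in range(code.shape[0]):
                if np.any(label == j):
                    code[j] = data[label == j].mean(axis = 0)
        return code, label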

Example of use:
    #++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    # Create an artificial 2-dimension, 3-cluster GM model and sample it
    #++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    d, k, mode  = 2, 3, 'diag'
    w, mu, va   = GM.gen_param(d, k, mode, spread = 1.5)
    gm          = GM.fromvalues(w, mu, va)

    # Sample 1000 frames from the model
    data    = gm.sample(1000)

    #++++++++++++++++++++++++
    # Learn the model with EM
    #++++++++++++++++++++++++
    # Init the model to be trained
    lgm = GM(d, k, mode)
    gmm = GMM(lgm, 'kmean')

    # The actual EM, with likelihood computation. The threshold
    # is compared to the (linearly approximated) derivative of the likelihood
    em      = EM()
    like    = em.train(data, gmm, maxiter = 30, thresh = 1e-8)
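
To make the stopping rule above concrete, here is a sketch of such a test (an
assumed form, not the toolbox's exact code): training stops once the relative
change of the log-likelihood between two iterations, a linear
(finite-difference) approximation of its derivative, falls below thresh:

    def has_converged(like_prev, like_cur, thresh = 1e-8):
        # relative increase of the log-likelihood between two EM iterations
        return abs(like_cur - like_prev) < thresh * abs(like_prev)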

Files example.py and example2.py show more capabilities of the toolbox,
including plotting (using matplotlib) and model selection with the Bayesian
Information Criterion (BIC).
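
For reference, the BIC of a trained mixture can be computed along these lines
(a generic sketch under the usual definition, not necessarily the exact
formula used by the toolbox); a diagonal model has k - 1 free weights, k * d
means and k * d variances:

    import numpy as np

    def bic(log_like, k, d, n):
        # penalized log-likelihood: the higher, the better the model
        free_params = (k - 1) + k * d + k * d
        return log_like - 0.5 * free_params * np.log(n)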

Bibliography:
    * A. P. Dempster, N. M. Laird and D. B. Rubin, "Maximum likelihood from
    incomplete data via the EM algorithm", Journal of the Royal Statistical
    Society, Series B, 39(1):1--38, 1977.
    * S. J. Roberts, D. Husmeier, I. Rezek and W. Penny, "Bayesian Approaches
    to Gaussian Mixture Modelling", IEEE Transactions on Pattern Analysis and
    Machine Intelligence, 1998.

Copyright: David Cournapeau 2006
License: BSD-style (see LICENSE.txt in main source directory)
"""
version = '0.5.6'

depends = ['linalg', 'stats']
ignore  = False