File: tutorial.rst

package info (click to toggle)
python-hmmlearn 0.3.2-3
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 588 kB
  • sloc: python: 4,797; cpp: 321; makefile: 13
file content (243 lines) | stat: -rw-r--r-- 8,751 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
.. _tutorial:

Tutorial
========

.. currentmodule:: hmmlearn

``hmmlearn`` implements the Hidden Markov Models (HMMs).
The HMM is a generative probabilistic model, in which a sequence of observable
:math:`\mathbf{X}` variables is generated by a sequence of internal hidden
states :math:`\mathbf{Z}`. The hidden states are not observed directly.
The transitions between hidden states are assumed to have the form of a
(first-order) Markov chain. They can be specified by the start probability
vector :math:`\boldsymbol{\pi}` and a transition probability matrix
:math:`\mathbf{A}`. The emission probability of an observable can be any
distribution with parameters :math:`\boldsymbol{\theta}` conditioned on the
current hidden state. The HMM is completely determined by
:math:`\boldsymbol{\pi}`, :math:`\mathbf{A}` and :math:`\boldsymbol{\theta}`.

There are three fundamental problems for HMMs:

* Given the model parameters and observed data, estimate the optimal
  sequence of hidden states.
* Given the model parameters and observed data, calculate the model likelihood.
* Given just the observed data, estimate the model parameters.

The first and the second problem can be solved by the dynamic programming
algorithms known as
the Viterbi algorithm and the Forward-Backward algorithm, respectively.
The last one can be solved by an iterative Expectation-Maximization (EM)
algorithm, known as the Baum-Welch algorithm.

.. topic:: References:

  - Lawrence R. Rabiner "A tutorial on hidden Markov models and selected
    applications in speech recognition", Proceedings of the IEEE 77.2,
    pp. 257-286, 1989.
  - Jeff A. Bilmes, "A gentle tutorial of the EM algorithm and its application
    to parameter estimation for Gaussian mixture and hidden Markov models.",
    1998.
  - Mark Stamp. "A revealing introduction to hidden Markov models". Tech. rep.
    Department of Computer Science, San Jose State University, 2018.
    url: http://www.cs.sjsu.edu/~stamp/RUA/HMM.pdf.


Available models
----------------

.. autosummary::
   :nosignatures:

   hmm.CategoricalHMM
   hmm.GaussianHMM
   hmm.GMMHMM
   hmm.MultinomialHMM
   hmm.PoissonHMM
   vhmm.VariationalCategoricalHMM
   vhmm.VariationalGaussianHMM

:ref:`Read on <customizing>` for details on how to implement a HMM with a
custom emission probability.

Building HMM and generating samples
-----------------------------------

You can build a HMM instance by passing the parameters described above to the
constructor. Then, you can generate samples from the HMM by calling
:meth:`~.BaseHMM.sample`.

>>> import numpy as np
>>> from hmmlearn import hmm
>>> np.random.seed(42)
>>>
>>> model = hmm.GaussianHMM(n_components=3, covariance_type="full")
>>> model.startprob_ = np.array([0.6, 0.3, 0.1])
>>> model.transmat_ = np.array([[0.7, 0.2, 0.1],
...                             [0.3, 0.5, 0.2],
...                             [0.3, 0.3, 0.4]])
>>> model.means_ = np.array([[0.0, 0.0], [3.0, -3.0], [5.0, 10.0]])
>>> model.covars_ = np.tile(np.identity(2), (3, 1, 1))
>>> X, Z = model.sample(100)

The transition probability matrix need not to be ergodic. For instance, a
left-right HMM can be defined as follows:

>>> lr = hmm.GaussianHMM(n_components=3, covariance_type="diag",
...                      init_params="cm", params="cmt")
>>> lr.startprob_ = np.array([1.0, 0.0, 0.0])
>>> lr.transmat_ = np.array([[0.5, 0.5, 0.0],
...                          [0.0, 0.5, 0.5],
...                          [0.0, 0.0, 1.0]])

If any of the required parameters are missing, :meth:`~.BaseHMM.sample`
will raise an exception:

>>> model = hmm.GaussianHMM(n_components=3)
>>> X, Z = model.sample(100)
Traceback (most recent call last):
    ...
sklearn.exceptions.NotFittedError: This GaussianHMM instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

Fixing parameters
-----------------

Each HMM parameter has a character code which can be used to customize its
initialization and estimation.  The EM algorithm needs a starting point to
proceed, thus prior to training each parameter is assigned a value
either random or computed from the data. It is possible to hook into this
process and provide a starting point explicitly. To do so

1. ensure that the character code for the parameter is missing from
   :attr:`~.BaseHMM.init_params` and then
2. set the parameter to the desired value.

For example, consider a HMM with an explicitly initialized transition
probability matrix:

>>> model = hmm.GaussianHMM(n_components=3, n_iter=100, init_params="mcs")
>>> model.transmat_ = np.array([[0.7, 0.2, 0.1],
...                             [0.3, 0.5, 0.2],
...                             [0.3, 0.3, 0.4]])

A similar trick applies to parameter estimation. If you want to fix some
parameter at a specific value, remove the corresponding character from
:attr:`~.BaseHMM.params` and set the parameter value before training.

.. topic:: Examples:

 * :doc:`/auto_examples/plot_hmm_sampling_and_decoding`

Training HMM parameters and inferring the hidden states
-------------------------------------------------------

You can train an HMM by calling the :meth:`~.BaseHMM.fit` method. The input
is a matrix of concatenated sequences of observations (*aka* samples) along with
the lengths of the sequences (see :ref:`Working with multiple sequences <multiple_sequences>`).

Note, since the EM algorithm is a gradient-based optimization method, it will
generally get stuck in local optima. You should in general try to run ``fit``
with various initializations and select the highest scored model.

The score of the model can be calculated by the :meth:`~.BaseHMM.score` method.

The inferred optimal hidden states can be obtained by calling
:meth:`~.BaseHMM.predict` method. The ``predict`` method can be
specified with a decoder algorithm. Currently the Viterbi algorithm
(``"viterbi"``), and maximum a posteriori estimation (``"map"``) are supported.

This time, the input is a single sequence of observed values.  Note, the states
in ``remodel`` will have a different order than those in the generating model.

>>> remodel = hmm.GaussianHMM(n_components=3, covariance_type="full", n_iter=100)
>>> remodel.fit(X)
GaussianHMM(...
>>> Z2 = remodel.predict(X)

.. topic:: Examples:

 * :doc:`/auto_examples/plot_casino`

Monitoring convergence
----------------------

The number of EM algorithm iterations is upper bounded by the ``n_iter``
parameter. The training proceeds until ``n_iter`` steps were performed or the
change in score is lower than the specified threshold ``tol``. Note, that
depending on the data, the EM algorithm may or may not achieve convergence in the
given number of steps.

You can use the :attr:`~.BaseHMM.monitor_` attribute to diagnose convergence:

>>> remodel.monitor_
ConvergenceMonitor(
    history=[...],
    iter=...,
    n_iter=100,
    tol=0.01,
    verbose=False,
)
>>> remodel.monitor_.converged
True

.. _multiple_sequences:

Working with multiple sequences
-------------------------------

All of the examples so far were using a single sequence of observations.
The input format in the case of multiple sequences is a bit involved and
is best understood by example.

Consider two 1D sequences:

>>> X1 = [[0.5], [1.0], [-1.0], [0.42], [0.24]]
>>> X2 = [[2.4], [4.2], [0.5], [-0.24]]

To pass both sequences to :meth:`~.BaseHMM.fit` or
:meth:`~.BaseHMM.predict`, first concatenate them into a single array and
then compute an array of sequence lengths:

>>> X = np.concatenate([X1, X2])
>>> lengths = [len(X1), len(X2)]

Finally, just call the desired method with ``X`` and ``lengths``:

>>> hmm.GaussianHMM(n_components=3).fit(X, lengths)
GaussianHMM(...

Saving and loading HMM
----------------------

After training, a HMM can be easily persisted for future use with the standard
:mod:`pickle` module:

>>> import pickle
>>> with open("filename.pkl", "wb") as file: pickle.dump(remodel, file)
>>> with open("filename.pkl", "rb") as file: pickle.load(file)
GaussianHMM(...

.. _customizing:

Implementing HMMs with custom emission probabilities
----------------------------------------------------

If you want to implement a custom emission probability (e.g. Cauchy), you have to
subclass :class:`~.BaseHMM` and override the following methods

.. currentmodule:: hmmlearn.base

.. autosummary::

   BaseHMM._init
   BaseHMM._check
   BaseHMM._generate_sample_from_state
   BaseHMM._compute_log_likelihood
   BaseHMM._compute_likelihood
   BaseHMM._initialize_sufficient_statistics
   BaseHMM._accumulate_sufficient_statistics
   BaseHMM._do_mstep

Optionally, only one of `~.BaseHMM._compute_likelihood` and
`~.BaseHMM._compute_log_likelihood` need to be overridden, and the
base implementation will provide the other.