..
   Copyright (C) 2014 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of the
   license.


.. testsetup::

   import numpy
   numpy.random.seed(1)


Principal component analysis
============================

This example uses a simple principal component analysis (PCA) model to find a
two-dimensional latent subspace in a higher-dimensional dataset.

Data
----

Let us create a Gaussian dataset with a two-dimensional latent space and some
observation noise:

>>> M = 20
>>> N = 100

>>> import numpy as np
>>> x = np.random.randn(N, 2)
>>> w = np.random.randn(M, 2)
>>> f = np.einsum('ik,jk->ij', w, x)
>>> y = f + 0.1*np.random.randn(M, N)
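
The rows of ``y`` correspond to the observed dimensions and the columns to the
samples, so the data matrix has shape ``(M, N)``:

>>> y.shape
(20, 100)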

Model
-----

We will use a 10-dimensional latent space in our model and let the model learn
the true dimensionality:

>>> D = 10

Import relevant nodes:

>>> from bayespy.nodes import GaussianARD, Gamma, SumMultiply

The latent states:

>>> X = GaussianARD(0, 1, plates=(1,N), shape=(D,))

The loading matrix with an automatic relevance determination (ARD) prior, which
allows unnecessary latent dimensions to be pruned out:

>>> alpha = Gamma(1e-5, 1e-5, plates=(D,))
>>> C = GaussianARD(0, alpha, plates=(M,1), shape=(D,))

Compute the dot product of the latent states and the loading matrix:

>>> F = SumMultiply('d,d->', X, C)
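
The ``(D,)`` axes of ``X`` and ``C`` are summed over, while their plates
``(1, N)`` and ``(M, 1)`` broadcast against each other, so ``F`` should have
one plate per element of the data matrix:

>>> F.plates
(20, 100)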

The precision of the observation noise:

>>> tau = Gamma(1e-5, 1e-5)

The observed variable:

>>> Y = GaussianARD(F, tau)
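
To summarize, the construction above corresponds (up to BayesPy's plate
bookkeeping) to the following generative model, where
:math:`\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})` denotes a Gaussian
distribution with mean :math:`\boldsymbol{\mu}` and covariance
:math:`\boldsymbol{\Sigma}`:

.. math::

   \mathbf{x}_n &\sim \mathcal{N}(\mathbf{0}, \mathbf{I}),
   \quad n = 1, \ldots, N, \\
   \alpha_d &\sim \mathrm{Gamma}(10^{-5}, 10^{-5}),
   \quad d = 1, \ldots, D, \\
   \mathbf{c}_m &\sim \mathcal{N}(\mathbf{0}, \operatorname{diag}(\boldsymbol{\alpha})^{-1}),
   \quad m = 1, \ldots, M, \\
   \tau &\sim \mathrm{Gamma}(10^{-5}, 10^{-5}), \\
   y_{mn} &\sim \mathcal{N}(\mathbf{c}_m^{\mathsf{T}} \mathbf{x}_n, \tau^{-1}).

The inverses appear because the ``GaussianARD`` nodes are parameterized by
precision rather than (co)variance.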

Inference
---------

Observe the data:

>>> Y.observe(y)

We do not have missing data here, but missing values could easily be handled
with the ``mask`` keyword argument of ``observe`` (see the sketch below).
Construct the variational Bayesian (VB) inference engine:

>>> from bayespy.inference import VB
>>> Q = VB(Y, X, C, alpha, tau)
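
As mentioned above, missing values could be handled by giving a boolean mask to
``observe``.  This example has no missing data, but the usage would look
roughly like this::

   # Sketch only: True marks an observed entry, False a missing one
   mask = np.random.rand(M, N) < 0.9
   Y.observe(y, mask=mask)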

Initialize the latent subspace randomly; otherwise, both ``X`` and ``C`` would
converge to zero:

>>> C.initialize_from_random()

.. currentmodule:: bayespy.inference

Now we could use :func:`VB.update` to run the inference.  However, let us first
create a parameter expansion to speed up the inference.  The expansion is based
on rotating the latent subspace optimally.  This step is optional but usually
speeds up convergence significantly, especially in high-dimensional problems:

>>> from bayespy.inference.vmp.transformations import RotateGaussianARD
>>> rot_X = RotateGaussianARD(X)
>>> rot_C = RotateGaussianARD(C, alpha)

By passing ``alpha`` to ``rot_C``, the rotation will also optimize ``alpha``
jointly with ``C``.  Now that we have defined the rotations for our variables,
we construct an optimizer:

>>> from bayespy.inference.vmp.transformations import RotationOptimizer
>>> R = RotationOptimizer(rot_X, rot_C, D)

In order to apply the rotations automatically, we set the rotation optimizer as
a callback function:

>>> Q.set_callback(R.rotate)

For more information about the rotation parameter expansion, see
:cite:`Luttinen:2010` and :cite:`Luttinen:2013`.  Now we can run the actual
inference until convergence:

>>> Q.update(repeat=1000)
Iteration 1: loglike=-2.33...e+03 (... seconds)
...
Iteration ...: loglike=6.500...e+02 (... seconds)
Converged at iteration ...
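
After convergence, the posterior distributions of the nodes can be examined
directly.  For instance, the posterior mean of the noise precision ``tau``
should be close to the precision used to generate the data (noise standard
deviation 0.1, i.e. precision 100)::

   # Posterior mean of the noise precision (first moment of the Gamma node)
   print(tau.get_moments()[0])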



Results
-------

The results can be visualized, for instance, by plotting a Hinton diagram of
the loading matrix:

>>> import bayespy.plot as bpplt
>>> bpplt.pyplot.figure()
<matplotlib.figure.Figure object at 0x...>
>>> bpplt.hinton(C)

.. plot::

   import numpy
   numpy.random.seed(1)
   M = 20
   N = 100
   import numpy as np
   x = np.random.randn(N, 2)
   w = np.random.randn(M, 2)
   f = np.einsum('ik,jk->ij', w, x)
   y = f + 0.1*np.random.randn(M, N)
   D = 10
   from bayespy.nodes import GaussianARD, Gamma, SumMultiply
   X = GaussianARD(0, 1, plates=(1,N), shape=(D,))
   alpha = Gamma(1e-5, 1e-5, plates=(D,))
   C = GaussianARD(0, alpha, plates=(M,1), shape=(D,))
   F = SumMultiply('d,d->', X, C)
   tau = Gamma(1e-5, 1e-5)
   Y = GaussianARD(F, tau)
   Y.observe(y)
   from bayespy.inference import VB
   Q = VB(Y, X, C, alpha, tau)
   C.initialize_from_random()
   from bayespy.inference.vmp.transformations import RotateGaussianARD
   rot_X = RotateGaussianARD(X)
   rot_C = RotateGaussianARD(C, alpha)
   from bayespy.inference.vmp.transformations import RotationOptimizer
   R = RotationOptimizer(rot_X, rot_C, D)
   Q.set_callback(R.rotate)
   Q.update(repeat=1000)
   import bayespy.plot as bpplt
   bpplt.hinton(C)

The method has been able to prune out the unnecessary latent dimensions and
keep only two components, which is the true number of components.
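
One way to verify this numerically is to look at the posterior means of the ARD
parameters: a large :math:`\alpha_d` means that the corresponding column of
``C`` has been driven to zero, that is, the dimension has been pruned out.  A
rough sketch::

   # Posterior means of the ARD precisions; only two of them should remain
   # small, corresponding to the two active components.
   print(alpha.get_moments()[0])

The posterior of the reconstruction ``F`` can also be compared to the true
noiseless function values: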

>>> bpplt.pyplot.figure()
<matplotlib.figure.Figure object at 0x...>
>>> bpplt.plot(F)
>>> bpplt.plot(f, color='r', marker='x', linestyle='None')

.. plot::

   import numpy
   numpy.random.seed(1)
   M = 20
   N = 100
   import numpy as np
   x = np.random.randn(N, 2)
   w = np.random.randn(M, 2)
   f = np.einsum('ik,jk->ij', w, x)
   y = f + 0.1*np.random.randn(M, N)
   D = 10
   from bayespy.nodes import GaussianARD, Gamma, SumMultiply
   X = GaussianARD(0, 1, plates=(1,N), shape=(D,))
   alpha = Gamma(1e-5, 1e-5, plates=(D,))
   C = GaussianARD(0, alpha, plates=(M,1), shape=(D,))
   F = SumMultiply('d,d->', X, C)
   tau = Gamma(1e-5, 1e-5)
   Y = GaussianARD(F, tau)
   Y.observe(y)
   from bayespy.inference import VB
   Q = VB(Y, X, C, alpha, tau)
   C.initialize_from_random()
   from bayespy.inference.vmp.transformations import RotateGaussianARD
   rot_X = RotateGaussianARD(X)
   rot_C = RotateGaussianARD(C, alpha)
   from bayespy.inference.vmp.transformations import RotationOptimizer
   R = RotationOptimizer(rot_X, rot_C, D)
   Q.set_callback(R.rotate)
   Q.update(repeat=1000)
   import bayespy.plot as bpplt
   bpplt.plot(F)
   bpplt.plot(f, color='r', marker='x', linestyle='None')

.. currentmodule:: bayespy.nodes

The reconstruction of the noiseless function values is practically perfect in
this simple example.  A larger noise variance, more latent dimensions or
missing values would make the problem more difficult.  The model construction
could also be improved by having, for instance, ``C`` and ``tau`` in the same
node without factorizing between them in the posterior approximation.  This can
be achieved by using the :class:`GaussianGammaISO` node.