# NeuralNetworkWrapper.py
# Contact: Jacob Schreiber <jmschreiber91@gmail.com>
import numpy
class NeuralNetworkWrapper():
    '''A wrapper for a neural network model for use in pomegranate.

    This wrapper stores a pointer to the model, as well as an indicator
    of which class it represents. It needs to know the number of
    dimensions of the input and the total number of classes in the data.
    It is currently built to work with keras models, but can easily be
    modified to work with the package of your choice.

    Training works in a somewhat hacky manner. Internally, pomegranate
    will scan over all components of the model, calling `summarize` on
    each one, and then `from_summaries` at the end. During the EM
    procedure, the samples and their associated weights are passed in to
    the `summarize` method. The associated weights are the
    responsibilities calculated during the EM algorithm. In theory, one
    could simply update the model using the samples and their
    corresponding weights. In practice, it is much better to reconstruct
    the whole responsibility matrix for the batch of data and then train
    on soft labels.

    The training process is as follows. When pomegranate starts a round
    of optimization, this wrapper stores a pointer to the data set on the
    neural network model object. The same data set is passed to each
    NeuralNetworkWrapper; only the `weights` differ. Each successive call
    therefore stores the weights that are passed in (the responsibilities
    from the EM algorithm) into an associated label matrix. The result is
    a single copy of the data batch and a corresponding matrix of soft
    labels. Keras allows us to train a classifier on soft labels, and
    this is the preferred strategy.

    Parameters
    ----------
    model : object
        The neural network model being utilized.

    i : int
        The class that this distribution represents.

    n_dimensions : int
        The number of dimensions in the input.

    n_classes : int
        The total number of classes that the model can output.
    '''
    def __init__(self, model, i, n_dimensions, n_classes):
        self.d = n_dimensions
        self.n_classes = n_classes
        self.model = model
        self.i = i
        self.model.X = []
        self.model.y = []
        self.model.w = []
    def log_probability(self, X):
        '''Return pseudo-log probabilities from the neural network.

        This method returns the log probability, under the model, of the
        class that this wrapper represents. Thus, it is not strictly a
        likelihood but rather a posterior. However, because the HMM takes
        log probabilities, multiplies them by a prior, and then
        normalizes them, the two work out to be mathematically
        equivalent.

        This method uses the `predict` function of the neural network,
        which should take in a single batch of data and return the
        posterior probability of each class given the network. Typically
        this is calculated using a softmax on the outputs. The output of
        `predict` should be a matrix of size (n, k), where n is the
        number of samples and k is the number of classes, and each row
        sums to 1.

        Parameters
        ----------
        X : numpy.ndarray, shape=(n, d)
            The batch of data to calculate probabilities over.
        '''

        return numpy.log(self.model.predict(X)[:, self.i])
    def summarize(self, X, w):
        '''When shown a batch of data, store the data.

        This stores the batch of data and the associated weights on the
        model object. The actual update occurs when `from_summaries` is
        called.

        Parameters
        ----------
        X : numpy.ndarray, shape=(n, d)
            The batch of data to be passed in.

        w : numpy.ndarray, shape=(n,)
            The associated weights. These can be uniform if unweighted.

        Returns
        -------
        None
        '''

        if self.i == 0:
            self.model.X = X.copy()
            self.model.y = numpy.zeros((X.shape[0], self.n_classes))

        self.model.y[:, self.i] = w
    def from_summaries(self, inertia=0.0):
        '''Perform a single gradient update to the network.

        This performs a single gradient update step using the
        `train_on_batch` method of the network. This is already
        implemented for keras networks, but for other packages this
        method will have to be implemented. It should take in a single
        batch of data, along with the associated sample weights, and
        update the model weights.

        Parameters
        ----------
        inertia : double, optional
            This parameter is ignored for neural networks, but is
            required for compatibility reasons.

        Returns
        -------
        None
        '''

        if self.i == 0:
            self.model.train_on_batch(self.model.X, self.model.y)

        self.clear_summaries()
    def clear_summaries(self):
        '''Clear the stored data and soft labels from the model object.'''

        self.model.X = None
        self.model.y = None
    @classmethod
    def from_samples(cls, X, weights):
        '''Training of this wrapper should be performed by the main model.

        This method should not be used directly to train the network.
        '''

        return cls
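
# The summarize/from_summaries protocol described above can be sketched
# without keras or pomegranate. `MockNetwork` below is a hypothetical
# stand-in exposing the keras-style `predict`/`train_on_batch` API the
# wrapper expects; the bookkeeping mirrors what each component's
# `summarize` call performs on the shared model object.

```python
import numpy

class MockNetwork:
    '''Hypothetical stand-in with the keras-style API the wrapper expects.'''
    def __init__(self, n_classes):
        self.n_classes = n_classes
        self.n_updates = 0
    def predict(self, X):
        # Uniform posteriors; a real keras model would return softmax outputs.
        return numpy.full((len(X), self.n_classes), 1.0 / self.n_classes)
    def train_on_batch(self, X, y):
        self.n_updates += 1

net = MockNetwork(n_classes=2)

# One EM iteration: every component sees the same batch X, but each
# receives its own column of responsibilities w.
X = numpy.random.randn(8, 5)
w = numpy.random.rand(8, 2)
w /= w.sum(axis=1, keepdims=True)      # each row of responsibilities sums to 1

# summarize() for component 0: store the batch, allocate the soft labels.
net.X = X.copy()
net.y = numpy.zeros((X.shape[0], 2))
net.y[:, 0] = w[:, 0]
# summarize() for component 1: fill in only its own column.
net.y[:, 1] = w[:, 1]

# from_summaries(): component 0 alone triggers the single gradient step,
# training on the full soft-label matrix.
net.train_on_batch(net.X, net.y)
```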