File: NeuralNetworkWrapper.py

package info (click to toggle)
python-pomegranate 0.15.0-2
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 36,948 kB
  • sloc: python: 11,489; makefile: 259; sh: 28
file content (147 lines) | stat: -rw-r--r-- 5,534 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
# NeuralNetworkWrapper.py
# Contact: Jacob Schreiber <jmschreiber91@gmail.com>

import numpy

class NeuralNetworkWrapper():
    '''A wrapper for a neural network model for use in pomegranate.
    
    This wrapper will store a pointer to the model, as well as an indicator
    of what class it represents. It needs information about the number of
    dimensions of the input and the total number of classes in the input.
    It is currently built to work with keras models, but can be easily
    modified to work with the package of your choice.
    
    Training works in a somewhat hacky manner. Internally, pomegranate
    will scan over all components of the model, calling `summarize` for each
    one, and then `from_summaries` at the end. During the EM procedure, the
    samples and their associated weights are passed in to the `summarize`
    function. The associated weights are the responsibilities calculated
    during the EM algorithm. In theory, one could simply update the model
    using the samples and their corresponding weights. In practice, it's much
    better to reconstruct the whole responsibility matrix for the batch of
    data and then train on soft labels.
    
    The process for training is as follows. When pomegranate starts a round of
    optimization, this wrapper will store a pointer to the data set to the
    neural network model object. This data set is the same one passed to each
    NeuralNetworkWrapper, the only difference being the ~weights~. Thus, each
    successive call will store the weights that are passed in (the responsibilities
    from the EM algorithm) to an associated label matrix. The result is a single
    copy of the data batch and a corresponding matrix of soft labels. Keras
    allows us to train a classifier on soft labels, and this is the preferred
    strategy.
    
    Parameters
    ----------
    model : object
        The neural network model being utilized.
        
    i : int
        The class that this distribution represents.
    
    n_dimensions : int
        The number of dimensions in the input.
    
    n_classes : int
        The total number of classes that the model can output.
    '''
    
    def __init__(self, model, i, n_dimensions, n_classes):
        self.d = n_dimensions
        self.n_classes = n_classes
        self.model = model
        self.i = i
        self.model.X = []
        self.model.y = []
        self.model.w = []
    
    def log_probability(self, X):
        ''' Return pseudo-log probabilities from the neural network.
        
        This method returns the log probability of the class that this
        wrapper represents given the model. Thus, it's not strictly a
        likelihood function, but rather, a posterior. However, because
        the HMM takes log probabilities, multiplies them with a prior,
        and then normalizes them, mathematically they work out to be
        equivalent.

        This method uses the `predict` function from the neural network,
        which should take in a single batch of data, and returns the
        posterior probability of each class given the network. Typically,
        this is calculated using a softmax function on the outputs. The
        output of this function should be a matrix of size (n, k), where
        n is the number of samples and k is the number of classes, where
        the sum over each sample is equal to 1. 

        Parameters
        ----------
        X : numpy.ndarray, shape=(n, d)
            The batch of data to calculate probabilities over.
        '''
        
        return numpy.log(self.model.predict(X)[:,self.i])
    
    def summarize(self, X, w):
        '''When shown a batch of data, store the data.

        This will store the batch of data, and associated weights, to the
        object. The actual update occurs when `from_summaries` is called.

        Parameters
        ----------
        X : numpy.ndarray, shape=(n, d)
            The batch of data to be passed in.

        w : numpy.ndarray, shape=(n,)
            The associated weights. These can be uniform if unweighted.

        Returns
        -------
        None
        '''

        if self.i == 0:
            self.model.X = X.copy()
            self.model.y = numpy.zeros((X.shape[0], self.n_classes))
            
        self.model.y[:, self.i] = w
        
    def from_summaries(self, inertia=0.0):
        '''Perform a single gradient update to the network.

        This will perform a single gradient update step, using the
        `train_on_batch` method from the network. This is already implemented
        for keras networks, but for other networks this method will have to
        be implemented. It should take in a single batch of data, along with
        associated sample weights, and update the model weights.

        Parameters
        ----------
        inertia : double, optional
            This parameter is ignored for neural networks, but required for
            compatibility reasons.

        Returns
        -------
        None
        '''


        if self.i == 0:
            self.model.train_on_batch(self.model.X, self.model.y)
            self.clear_summaries()
    
    def clear_summaries(self):
        self.model.X = None
        self.model.y = None


    @classmethod
    def from_samples(self, X, weights):
        '''The training of this wrapper on data should be performed in the main model.

        This method should not be used directly to train the network.
        '''

        return self