File: perceptron.md

## `Perceptron`

The `Perceptron` class implements the simple perceptron classifier introduced
by Frank Rosenblatt in 1958.  The perceptron is a linear classifier, and can be
understood as a trivial neural network with one neuron that uses the step
function as an activation function.  mlpack's `Perceptron` class also offers
several template parameters that can be used to control the behavior of the
perceptron.

Perceptrons are useful for classifying points with _discrete labels_ (e.g., `0`,
`1`, `2`).  Because they are simple classifiers, they are also useful as _weak
learners_ for the [`AdaBoost`](adaboost.md) boosting classifier.

#### Simple usage example:

```c++
// Train a perceptron on random numeric data and predict labels on test data:

// All data and labels are uniform random; 10 dimensional data, 5 classes.
// Replace with a Load() call or similar for a real application.
arma::mat dataset(10, 1000, arma::fill::randu);
arma::Row<size_t> labels =
    arma::randi<arma::Row<size_t>>(1000, arma::distr_param(0, 4));
arma::mat testDataset(10, 500, arma::fill::randu); // 500 test points.

mlpack::Perceptron p;                 // Step 1: create model.
p.Train(dataset, labels, 5);          // Step 2: train model.
arma::Row<size_t> predictions;
p.Classify(testDataset, predictions); // Step 3: classify points.

// Print some information about the test predictions.
std::cout << arma::accu(predictions == 1) << " test points classified as class "
    << "1." << std::endl;
```
<p style="text-align: center; font-size: 85%"><a href="#simple-examples">More examples...</a></p>

#### Quick links:

 * [Constructors](#constructors): create `Perceptron` objects.
 * [`Train()`](#training): train model.
 * [`Classify()`](#classification): classify with a trained model.
 * [Other functionality](#other-functionality) for loading, saving, and
   inspecting.
 * [Examples](#simple-examples) of simple usage and links to detailed example
   projects.
 * [Template parameters](#advanced-functionality-template-parameters) for custom
   behavior.
 * [Advanced template examples](#advanced-functionality-examples) of use with
   custom template parameters.

#### See also:

 * [`NaiveBayesClassifier`](naive_bayes_classifier.md), another simple classifier
 * [`AdaBoost`](adaboost.md)
 * [`FFN`](/src/mlpack/methods/ann/ffn.hpp)
 * [mlpack classifiers](../modeling.md#classification)
 * [Perceptron on Wikipedia](https://en.wikipedia.org/wiki/Perceptron)

### Constructors

Construct a `Perceptron` object using one of the constructors below.  Defaults
and types are detailed in the [Constructor Parameters](#constructor-parameters)
section below.

#### Forms:

 * `p = Perceptron()`
   - Initialize perceptron without training.
   - You will need to call [`Train()`](#training) later to train the perceptron
     before calling [`Classify()`](#classification).

---

 * `p = Perceptron(numClasses, dimensionality, maxIterations=1000)`
   - Initialize perceptron with all-zero weights and biases.
   - `Classify()` can immediately be used; training is not required with this
     form.

---

 * `p = Perceptron(data, labels, numClasses,          maxIterations=1000)`
 * `p = Perceptron(data, labels, numClasses, weights, maxIterations=1000)`
   - Train the perceptron (optionally with instance weights).

---

#### Constructor Parameters:

| **name** | **type** | **description** | **default** |
|----------|----------|-----------------|-------------|
| `data` | [`arma::mat`](../matrices.md) | [Column-major](../matrices.md#representing-data-in-mlpack) training matrix. | _(N/A)_ |
| `labels` | [`arma::Row<size_t>`](../matrices.md) | Training labels, between [`0` and `numClasses - 1`](../core/normalizing_labels.md) (inclusive).  Should have length `data.n_cols`.  | _(N/A)_ |
| `weights` | [`arma::rowvec`](../matrices.md) | Weights for each training point.  Should have length `data.n_cols`.  | _(N/A)_ |
| `numClasses` | `size_t` | Number of classes in the dataset. | _(N/A)_ |
| `dimensionality` | `size_t` | Dimensionality of data (only used if an initialized but untrained model is desired). | _(N/A)_ |
| `maxIterations` | `size_t` | Maximum number of iterations during training.  Can also be set with `MaxIterations()`. | `1000` |

As an alternative to passing `maxIterations`, it can be set with a standalone
method.  The following function can be used before calling `Train()` to set
the maximum number of iterations:

 * `p.MaxIterations() = maxIter;` will set the maximum number of iterations
   during training to `maxIter`.

### Training

If training is not done as part of the constructor call, it can be done with one
of the following versions of the `Train()` member function:

 * `p.Train(data, labels, numClasses, maxIterations=1000)`
   - Train the perceptron on unweighted data.

---

 * `p.Train(data, labels, numClasses, weights, maxIterations=1000)`
   - Train the perceptron on data with instance weights.

---

Types of each argument are the same as in the table for constructors
[above](#constructor-parameters).

***Notes***:

 * Training is incremental.  Successive calls to `Train()` will not reinitialize
   the model, unless the given data has different dimensionality or `numClasses`
   is different.  To reinitialize the model, call `Reset()` (see
   [Other Functionality](#other-functionality)).

 * If `maxIterations` is not passed, but has been set in the constructor or with
   `MaxIterations()`, the previous setting will be used.
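
The incremental behavior described in the notes above can be illustrated with a
small standalone sketch.  It is written from scratch with `std::vector` and is
not mlpack's implementation: the `TinyPerceptron` type, its row-major weight
layout, and the exact update rule (subtract the point from the wrongly predicted
class, add it to the correct class) are assumptions chosen for illustration.

```c++
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical standalone multiclass perceptron; not part of mlpack.
struct TinyPerceptron
{
  std::size_t dims = 0;
  std::size_t numClasses = 0;
  std::vector<double> weights; // numClasses x dims, row-major.
  std::vector<double> biases;  // One bias per class.

  // Linear score of class `c` for point `x`.
  double Score(const std::size_t c, const std::vector<double>& x) const
  {
    double s = biases[c];
    for (std::size_t d = 0; d < dims; ++d)
      s += weights[c * dims + d] * x[d];
    return s;
  }

  // Return the class with the largest score (first class wins ties).
  std::size_t Classify(const std::vector<double>& x) const
  {
    assert(numClasses > 0 && x.size() == dims);
    std::size_t best = 0;
    double bestScore = Score(0, x);
    for (std::size_t c = 1; c < numClasses; ++c)
    {
      const double s = Score(c, x);
      if (s > bestScore) { best = c; bestScore = s; }
    }
    return best;
  }

  // Run up to `maxIterations` passes over the data; return the passes used.
  std::size_t Train(const std::vector<std::vector<double>>& points,
                    const std::vector<std::size_t>& labels,
                    const std::size_t classes,
                    const std::size_t maxIterations)
  {
    assert(classes > 0 && !points.empty() && points.size() == labels.size());
    // Reinitialize only when the shape of the problem changes; otherwise
    // training continues from the current weights.
    if (points[0].size() != dims || classes != numClasses)
    {
      dims = points[0].size();
      numClasses = classes;
      weights.assign(numClasses * dims, 0.0);
      biases.assign(numClasses, 0.0);
    }

    std::size_t iteration = 0;
    bool converged = false;
    while (iteration < maxIterations && !converged)
    {
      ++iteration;
      converged = true;
      for (std::size_t i = 0; i < points.size(); ++i)
      {
        assert(labels[i] < numClasses);
        const std::size_t predicted = Classify(points[i]);
        if (predicted == labels[i])
          continue;

        // Misclassified: move the two affected classes in opposite directions.
        converged = false;
        for (std::size_t d = 0; d < dims; ++d)
        {
          weights[predicted * dims + d] -= points[i][d];
          weights[labels[i] * dims + d] += points[i][d];
        }
        biases[predicted] -= 1.0;
        biases[labels[i]] += 1.0;
      }
    }
    return iteration;
  }
};
```

In this sketch, calling `Train()` a second time on data of the same shape
resumes from the weights left by the first call, which is the behavior the
notes describe; only a change in dimensionality or class count triggers
reinitialization.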

### Classification

Once a `Perceptron` is trained, the `Classify()` member function can be used to
make class predictions for new data.

 * `size_t predictedClass = p.Classify(point)`
    - ***(Single-point)***
    - Classify a single point, returning the predicted class.

---

 * `p.Classify(data, predictions)`
    - ***(Multi-point)***
    - Classify a set of points.
    - The prediction for data point `i` can be accessed with `predictions[i]`.

---

***Note***: perceptrons do not provide any measure resembling probabilities
during classification, and thus a version of `Classify()` that computes class
probabilities is not available.

#### Classification Parameters:

| **usage** | **name** | **type** | **description** |
|-----------|----------|----------|-----------------|
| _single-point_ | `point` | [`arma::vec`](../matrices.md) | Single point for classification. |
||||
| _multi-point_ | `data` | [`arma::mat`](../matrices.md) | Set of [column-major](../matrices.md#representing-data-in-mlpack) points for classification. |
| _multi-point_ | `predictions` | [`arma::Row<size_t>&`](../matrices.md) | Vector of `size_t`s to store class prediction into.  Will be set to length `data.n_cols`. |
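
As a rough illustration of why no class probabilities are available, the
following standalone sketch computes per-class linear scores and picks the
largest one.  It does not use mlpack; `LinearModel`, `ClassScores()`,
`ClassifyPoint()`, and `ClassifyAll()` are hypothetical names, and storing one
weight vector per class is an assumption that loosely follows the "one column
per class" layout of `Weights()`.

```c++
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical model layout: weights[c] holds the weight vector of class c.
struct LinearModel
{
  std::vector<std::vector<double>> weights;
  std::vector<double> biases;
};

// Compute the raw score of each class: bias plus dot product with the point.
// The scores are unbounded and do not sum to one, so they are not
// probabilities.
std::vector<double> ClassScores(const LinearModel& model,
                                const std::vector<double>& point)
{
  assert(model.weights.size() == model.biases.size());
  std::vector<double> scores(model.biases);
  for (std::size_t c = 0; c < scores.size(); ++c)
  {
    assert(model.weights[c].size() == point.size());
    for (std::size_t d = 0; d < point.size(); ++d)
      scores[c] += model.weights[c][d] * point[d];
  }
  return scores;
}

// Single-point form: the predicted class is the index of the largest score.
std::size_t ClassifyPoint(const LinearModel& model,
                          const std::vector<double>& point)
{
  const std::vector<double> scores = ClassScores(model, point);
  assert(!scores.empty());
  return static_cast<std::size_t>(std::distance(
      scores.begin(), std::max_element(scores.begin(), scores.end())));
}

// Multi-point form: one prediction per input point.
std::vector<std::size_t> ClassifyAll(
    const LinearModel& model,
    const std::vector<std::vector<double>>& data)
{
  std::vector<std::size_t> predictions;
  predictions.reserve(data.size());
  for (const std::vector<double>& point : data)
    predictions.push_back(ClassifyPoint(model, point));
  return predictions;
}
```

The single-point and multi-point helpers mirror the two `Classify()` forms
listed above: the first returns one label, the second fills a vector with one
label per point.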

### Other Functionality

 * A `Perceptron` can be serialized with
   [`Save()` and `Load()`](../load_save.md#mlpack-models-and-objects).

 * `p.NumClasses()` will return a `size_t` indicating the number of classes the
   perceptron was trained on.

 * `p.Biases()` will return an `arma::vec` with the biases of the model (each
   element corresponds to the bias for a class).

 * `p.Weights()` will return an `arma::mat` with the weights of the model (each
   column corresponds to the weights for one class label).

 * `p.Reset()` will re-initialize the weights and biases of the model.

For complete functionality, the [source
code](/src/mlpack/methods/perceptron/perceptron.hpp) can be consulted.  Each
method is fully documented.

### Simple Examples

See also the [simple usage example](#simple-usage-example) for a trivial use of
`Perceptron`.

---

Train a perceptron multiple times, incrementally, with custom hyperparameters,
and save the resulting model to disk.

```c++
// See https://datasets.mlpack.org/iris.csv.
arma::mat dataset;
mlpack::Load("iris.csv", dataset, mlpack::Fatal);
// See https://datasets.mlpack.org/iris.labels.csv.
arma::Row<size_t> labels;
mlpack::Load("iris.labels.csv", labels, mlpack::Fatal);

// Create a Perceptron object.
mlpack::Perceptron p;
// Set the maximum number of iterations to 100.  (This can also be done in the
// constructor.)
p.MaxIterations() = 100;

// Train the model for up to 100 iterations.
p.Train(dataset, labels, 3);

// Now, compute and print accuracy on the training set.
arma::Row<size_t> predictions;
p.Classify(dataset, predictions);
std::cout << "Training set accuracy after 100 iterations: "
    << (100.0 * double(arma::accu(labels == predictions)) / labels.n_elem)
    << "%." << std::endl;

// Train for another 250 iterations and compute training set accuracy again.
p.Train(dataset, labels, 3, 250);
p.Classify(dataset, predictions);
std::cout << "Training set accuracy after 350 iterations: "
    << (100.0 * double(arma::accu(labels == predictions)) / labels.n_elem)
    << "%." << std::endl;

// Save the perceptron to disk for later use.
mlpack::Save("perceptron.bin", p);
```

---

Load a saved perceptron from disk and print information about it.

```c++
mlpack::Perceptron p;
// This call assumes a perceptron has already been saved to `perceptron.bin`
// with `Save()`.
mlpack::Load("perceptron.bin", p, mlpack::Fatal);

if (p.NumClasses() > 0)
{
  std::cout << "The perceptron in `perceptron.bin` was trained on "
      << p.NumClasses() << " classes." << std::endl;
  std::cout << "The dimensionality of the perceptron model is "
      << p.Weights().n_rows << "." << std::endl;
  std::cout << "The bias weights for each class are:" << std::endl;
  for (size_t i = 0; i < p.NumClasses(); ++i)
    std::cout << "  - Class " << i << ": " << p.Biases()[i] << std::endl;
}
else
{
  std::cout << "The perceptron in `perceptron.bin` has not been trained."
      << std::endl;
}
```

---

### Advanced Functionality: Template Parameters

The `Perceptron` class also supports several template parameters, which can be
used for custom behavior.  The full signature of the class is as follows:

```
Perceptron<LearnPolicy,
           WeightInitializationPolicy,
           MatType>
```

 * `LearnPolicy`: the strategy used to learn the weights during training.
 * `WeightInitializationPolicy`: the way that weights are initialized before
   training.
 * `MatType`: specifies the type of matrix used for learning and internal
   representation of weights and biases.

---

#### `LearnPolicy`

 * Specifies the step to be taken when a point is misclassified.
 * The `SimpleWeightUpdate` class is available, and is the default.
 * A custom class must implement only one function:

```c++
// You can use this as a starting point for implementation.
class CustomLearnPolicy
{
  // Update the weights and biases in the `weights` matrix and the `biases`
  // vector given that the model currently classified `trainingPoint` as having
  // the label `incorrectClass`, when in reality it has the label
  // `correctClass`.  If `instanceWeight` is given, it specifies the instance
  // weight for the given `trainingPoint`.
  //
  // `VecType` will be an Armadillo-like vector type.  It will be a column from
  // the training data matrix (`data`) given to `Train()` or to the constructor.
  //
  // `eT` is the element type of the Perceptron (e.g. `float`, `double`).
  template<typename VecType, typename eT>
  void UpdateWeights(const VecType& trainingPoint,
                     arma::Mat<eT>& weights,
                     arma::Col<eT>& biases,
                     const size_t incorrectClass,
                     const size_t correctClass,
                     const double instanceWeight = 1.0);
};
```
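
To make the role of each `UpdateWeights()` argument concrete, here is a
standalone analogue of a learn policy that scales every update by a step size.
It deliberately avoids Armadillo so that it compiles on its own: the `Columns`
alias, the `ScaledLearnPolicy` name, and the `stepSize` member are all
hypothetical, and a real policy would use the templated Armadillo signature
shown above instead.

```c++
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the weights matrix: weights[c] is the column for class c.
using Columns = std::vector<std::vector<double>>;

// Hypothetical learn policy: a plain perceptron step scaled by `stepSize`.
struct ScaledLearnPolicy
{
  double stepSize = 0.5;

  void UpdateWeights(const std::vector<double>& trainingPoint,
                     Columns& weights,
                     std::vector<double>& biases,
                     const std::size_t incorrectClass,
                     const std::size_t correctClass,
                     const double instanceWeight = 1.0) const
  {
    assert(incorrectClass < weights.size() && correctClass < weights.size());
    assert(weights.size() == biases.size());
    assert(weights[incorrectClass].size() == trainingPoint.size());
    assert(weights[correctClass].size() == trainingPoint.size());

    // Heavier instances (larger `instanceWeight`) take a larger step.
    const double step = stepSize * instanceWeight;
    for (std::size_t d = 0; d < trainingPoint.size(); ++d)
    {
      // Push the wrongly predicted class away from the point...
      weights[incorrectClass][d] -= step * trainingPoint[d];
      // ...and pull the correct class toward it.
      weights[correctClass][d] += step * trainingPoint[d];
    }
    biases[incorrectClass] -= step;
    biases[correctClass] += step;
  }
};
```

With `stepSize = 0.5` and `instanceWeight = 2.0`, the effective step is `1.0`,
so the point itself is subtracted from the incorrect class's column and added
to the correct class's column.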

---

#### `WeightInitializationPolicy`

 * Specifies how the weights matrix and biases vector should be initialized when
   the `Perceptron` object is created, or when `Reset()` is called.
 * The `ZeroInitialization` _(default)_ and `RandomPerceptronInitialization`
   classes are available for drop-in usage.
 * `RandomPerceptronInitialization` will initialize weights and biases using a
   uniform random distribution between 0 and 1.
 * A custom class must implement only one function:

```c++
// You can use this as a starting point for implementation.
class CustomWeightInitializationPolicy
{
  // Initialize the `weights` matrix and `biases` vector, given that the model
  // will have dimensionality of `numFeatures` (that is, the training data
  // matrix will have `numFeatures` rows), and the training data has
  // `numClasses` classes.
  //
  // The initialized `weights` matrix should have `numFeatures` rows and
  // `numClasses` columns, and the initialized `biases` vector should have
  // `numClasses` elements.
  //
  // `eT` specifies the element type of the weights and biases; it may be
  // `double`, `float`, or another floating-point type.
  template<typename eT>
  inline static void Initialize(arma::Mat<eT>& weights,
                                arma::Col<eT>& biases,
                                const size_t numFeatures,
                                const size_t numClasses)
  {
    weights.randu(numFeatures, numClasses);
    biases.randu(numClasses);
  }
};
```

---

#### `MatType`

 * Specifies the matrix type to use for data when learning a perceptron.
 * By default, `MatType` is `arma::mat` (dense 64-bit precision matrix).
 * Any matrix type implementing the Armadillo API will work; so, for instance,
   `arma::fmat` or `arma::sp_mat` can be used.

### Advanced Functionality Examples

Train a `Perceptron` with random initialization, instead of zero initialization
of weights.

```c++
// 1000 random points in 10 dimensions.
arma::mat dataset(10, 1000, arma::fill::randu);
// Random labels for each point, totaling 5 classes.
arma::Row<size_t> labels =
    arma::randi<arma::Row<size_t>>(1000, arma::distr_param(0, 4));

// Train in the constructor.  Weights will be initialized randomly.
mlpack::Perceptron<mlpack::SimpleWeightUpdate,
                   mlpack::RandomPerceptronInitialization> p(
    dataset, labels, 5);

// Create test data (500 points).
arma::mat testDataset(10, 500, arma::fill::randu);
arma::Row<size_t> predictions;
p.Classify(testDataset, predictions);
// Now `predictions` holds predictions for the test dataset.

// Print some information about the test predictions.
std::cout << arma::accu(predictions == 1) << " test points classified as class "
    << "1." << std::endl;
```

---

Train a `Perceptron` on sparse 32-bit floating point data.

```c++
// 1000 sparse random points in 100 dimensions, with 1% nonzero elements.
arma::sp_fmat dataset;
dataset.sprandu(100, 1000, 0.01);
// Random labels for each point, totaling 5 classes.
arma::Row<size_t> labels =
    arma::randi<arma::Row<size_t>>(1000, arma::distr_param(0, 4));

// Train in the constructor.
mlpack::Perceptron p(dataset, labels, 5);

// Create test data (500 points).
arma::sp_fmat testDataset;
testDataset.sprandu(100, 500, 0.01);
arma::Row<size_t> predictions;
p.Classify(testDataset, predictions);
// Now `predictions` holds predictions for the test dataset.

// Print some information about the test predictions.
std::cout << arma::accu(predictions == 1) << " test points classified as class "
    << "1." << std::endl;
```

---