# Core math utilities
mlpack provides a number of mathematical utility classes and functions on top of
Armadillo.
* [Aliases](#aliases): utilities to create and manage aliases (`MakeAlias()`,
`ClearAlias()`, `UnwrapAlias()`).
* [`Range`](#range): simple mathematical range (e.g. `[0, 3]`)
* [`ColumnCovariance()`](#columncovariance): compute covariance of
[column-major](../matrices.md#representing-data-in-mlpack) data
* [`ColumnsToBlocks`](#columnstoblocks): reshape data points into a block
matrix for visualization (useful for images)
* [Distribution utilities](#distribution-utilities): `Digamma()`, `Trigamma()`
* [`RandVector()`](#randvector): generate random vector on the unit sphere
using the Box-Muller transform
* [Logarithmic utilities](#logarithmic-utilities): `LogAdd()`, `AccuLog()`,
`LogSumExp()`, `LogSumExpT()`.
* [`MultiplyCube2Cube()`](#multiplycube2cube): multiply each slice in a cube by each slice in another cube
* [`MultiplyMat2Cube()`](#multiplymat2cube): multiply a matrix by each slice in a cube
* [`MultiplyCube2Mat()`](#multiplycube2mat): multiply each slice in a cube by a matrix
* [`Quantile()`](#quantile): compute the quantile function of the Gaussian
distribution
* [RNG and random number utilities](#rng-and-random-number-utilities): extended
scalar random number generation functions
* [`RandomBasis()`](#randombasis): generate a random orthogonal basis
* [`ShuffleData()`](#shuffledata): shuffle a dataset and associated labels
---
## Aliases
Aliases are matrix, vector, or cube objects that share memory with another
matrix, vector, or cube. They are often used internally inside of mlpack to
avoid copies.
***Important caveats about aliases***:
- An alias represents the same memory block as the input. As such, changes to
the alias object will also be reflected in the original object.
- The `MakeAlias()` function is not guaranteed to return an alias; it only
returns an alias *if possible*, and makes a copy otherwise.
- If the aliased object (e.g. `mat`) goes out of scope or is destroyed, then any
alias of it (e.g. `a`) ***becomes invalid***.
_You are responsible for ensuring an invalid alias is not used!_
---
* `MakeAlias(a, vector, rows, cols, offset=0, strict=true)`
- Make `a` into an alias of `vector` with the given size.
- If `offset` is `0`, then the alias is identical: the first element of
`a` is the first element of `vector`. Otherwise, the first element of `a`
is the `offset`'th element of `vector`.
- If `strict` is `true`, the size of `a` cannot be changed.
- `vector` and `a` should have the same vector type (e.g. `arma::vec`,
`arma::fvec`).
- If an alias cannot be created, the vector will be copied.
* `MakeAlias(a, mat, rows, cols, offset=0, strict=true)`
- Make `a` into an alias of `mat` with the given size.
- If `offset` is `0`, then the alias is identical: the first element of
`a` is the first element of `mat`. Otherwise, the first element of `a`
is the `offset`'th element of `mat`; elements in `mat` are ordered in
a [column-major way](../matrices.md#representing-data-in-mlpack).
- If `strict` is `true`, the size of `a` cannot be changed.
- `mat` and `a` should have the same matrix type (e.g. `arma::mat`,
`arma::fmat`, `arma::sp_mat`).
- If an alias cannot be created, the matrix will be copied. Sparse types
cannot have aliases and will be copied.
* `MakeAlias(a, cube, rows, cols, slices, offset=0, strict=true)`
- Make `a` into an alias of `cube` with the given size.
- If `offset` is `0`, then the alias is identical: the first element of
`a` is the first element of `cube`. Otherwise, the first element of `a`
is the `offset`'th element of `cube`; elements in `cube` are ordered in
a [column-major way](../matrices.md#representing-data-in-mlpack).
- If `strict` is `true`, the size of `a` cannot be changed.
- `cube` and `a` should have the same cube type (e.g. `arma::cube`,
`arma::fcube`).
- If an alias cannot be created, the cube will be copied.
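To make the `offset` argument concrete: with column-major ordering, the `offset`'th element of a matrix with `nRows` rows sits at row `offset % nRows` and column `offset / nRows`. The helper below is a hypothetical illustration of that arithmetic only; `OffsetToRowCol` is not part of mlpack.

```c++
#include <cstddef>
#include <utility>

// Hypothetical helper (not part of mlpack): map a column-major linear offset
// to its (row, column) position in a matrix with `nRows` rows.
std::pair<std::size_t, std::size_t> OffsetToRowCol(const std::size_t offset,
                                                   const std::size_t nRows)
{
  return { offset % nRows, offset / nRows };
}
```

For a 3x4 matrix, `OffsetToRowCol(6, 3)` gives row 0, column 2. Under the semantics described above, an alias with `rows = 3`, `cols = 2`, and `offset = 6` would therefore start at the top of the third column and cover the last two columns.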
---
* `ClearAlias(a)`
- If `a` is an alias, reset `a` to an empty matrix, without modifying the
aliased memory. `a` is no longer an alias after this call.
---
* `UnwrapAlias(a, in)`
- If `in` is a matrix type (e.g. `arma::mat`), make `a` into an alias of
`in`.
- If `in` is not a matrix type, but instead, e.g., an Armadillo expression,
fill `a` with the results of the evaluated expression `in`.
- This can be used in place of, e.g., `a = in`, to avoid a copy when
possible.
- `a` should be a matrix type that matches the type of the expression or
matrix `in`.
## `Range`
The `Range` class represents a simple mathematical range (e.g. `[0, 3]`),
with the bounds represented as `double`s.
### Constructors
* `r = Range()`
- Construct an empty range.
* `r = Range(p)`
- Construct the range `[p, p]`.
* `r = Range(lo, hi)`
- Construct the range `[lo, hi]`.
### Accessing and modifying range properties
* `r.Lo()` and `r.Hi()` return the lower and upper bounds of the range as
`double`s.
- A range is considered empty if `r.Lo() > r.Hi()`.
- These can be used to modify the bounds, e.g., `r.Lo() = 3.0`.
* `r.Width()` returns the span of the range (i.e. `r.Hi() - r.Lo()`) as a
`double`.
* `r.Mid()` returns the midpoint of the range as a `double`.
### Working with ranges
* Given two ranges `r1` and `r2`,
- `r1 | r2` returns the union of the ranges,
- `r1 |= r2` expands `r1` to include the range `r2`,
- `r1 & r2` returns the intersection of the ranges (possibly an empty range),
- `r1 &= r2` shrinks `r1` to the intersection of `r1` and `r2`,
- `r1 == r2` returns `true` if the two ranges are strictly equal (i.e. lower
and upper bounds are equal),
- `r1 != r2` returns `true` if the two ranges are not strictly equal,
- `r1 < r2` returns `true` if `r1.Hi() < r2.Lo()`,
- `r1 > r2` returns `true` if `r1.Lo() > r2.Hi()`, and
- `r1.Contains(r2)` returns `true` if the ranges overlap at all.
* Given a range `r` and a `double` scalar `d`,
- `r * d` returns a new range `[d * r.Lo(), d * r.Hi()]`,
- `r *= d` scales `r.Lo()` and `r.Hi()` by `d`, and
- `r.Contains(d)` returns `true` if `d` is contained in the range.
---
* To use ranges with different element types (e.g. `float`), use the type
`RangeType<float>` or similar.
### Usage example
```c++
mlpack::Range r1(5.0, 6.0); // [5, 6]
mlpack::Range r2(7.0, 8.0); // [7, 8]
mlpack::Range r3 = r1 | r2; // [5, 8]
mlpack::Range r4 = r1 & r2; // empty range
bool b1 = r1.Contains(r2); // false
bool b2 = r1.Contains(5.5); // true
bool b3 = r1.Contains(r3); // true
bool b4 = r3.Contains(r4); // false
// Create a range of `float`s and a range of `int`s.
mlpack::RangeType<float> r5(1.0f, 1.5f); // [1.0, 1.5]
mlpack::RangeType<int> r6(3, 4); // [3, 4]
```
---
`Range` is used by:
* [`RangeSearch`](/src/mlpack/methods/range_search/range_search.hpp)
* [mlpack trees](../../developer/trees.md) <!-- TODO: link to local trees section -->
## `ColumnCovariance()`
* `ColumnCovariance(X, normType=0)`
- `X`: a [column-major](../matrices.md#representing-data-in-mlpack) data
matrix
- `normType`: either `0` or `1` (see below)
* Computes the covariance of the data matrix `X`.
* Equivalent to `arma::cov(X.t(), normType)`, but avoids computing the
transpose and is thus slightly more efficient.
* `normType` controls the type of normalization done when computing the
covariance:
- `0` will normalize with `X.n_cols - 1`, providing the best unbiased
estimation of the covariance matrix (if the columns are from a normal
distribution);
- `1` will normalize with `X.n_cols`, providing the second moment about the
mean of the columns.
* Any dense matrix type can be used so long as it supports the Armadillo API
(e.g., `arma::mat`, `arma::fmat`, etc.).
Example:
```c++
// Generate a random data matrix with 100 points in 5 dimensions.
arma::mat data(5, 100, arma::fill::randu);
// Compute the covariance matrix of the column-major matrix.
arma::mat cov = mlpack::ColumnCovariance(data);
cov.print("Covariance of random matrix:");
```
## `ColumnsToBlocks`
The `ColumnsToBlocks` class provides a way to transform data points (e.g.
columns in a matrix) into a block matrix format, primarily useful for
visualization as an image.
As a simple example, given a matrix with four columns `A`, `B`, `C`, and `D`,
`ColumnsToBlocks` can transform this matrix into the form
```
[[m m m m m]
[m A m B m]
[m m m m m]
[m C m D m]
[m m m m m]]
```
where `m` is a margin, and where each column may itself be reshaped into a
block.
### Constructors
* `ctb = ColumnsToBlocks(rows, cols)`
- Create a `ColumnsToBlocks` object that will reshape the input matrix into
blocks of shape `rows` by `cols`.
- Each input column will be reshaped into a square (i.e. `ctb.BlockHeight()`
and `ctb.BlockWidth()` are set to `0`).
* `ctb = ColumnsToBlocks(rows, cols, blockHeight, blockWidth)`
- Create a `ColumnsToBlocks` object that will reshape the input matrix into
blocks of shape `rows` by `cols`.
- Each individual column will also be reshaped into a block of shape
`blockHeight` by `blockWidth`.
### Properties
* `ctb.Rows(rows)` will set the number of rows in the block output to `rows`.
- `ctb.Rows()` will return a `size_t` with the current setting.
* `ctb.Cols(cols)` will set the number of columns in the block output to
`cols`.
- `ctb.Cols()` will return a `size_t` with the current setting.
* `ctb.BlockHeight(blockHeight)` will set the number of rows in each individual
block to `blockHeight`.
- `ctb.BlockHeight()` will return a `size_t` with the current setting.
- If `ctb.BlockHeight()` is `0`, each input column will be reshaped into a
square; if this is not possible, an exception will be thrown.
* `ctb.BlockWidth(blockWidth)` will set the number of columns in each individual
block to `blockWidth`.
- `ctb.BlockWidth()` will return a `size_t` with the current setting.
- If `ctb.BlockWidth()` is `0`, each input column will be reshaped into a
square; if this is not possible, an exception will be thrown.
* `ctb.BufSize(bufSize)` will set the number of margin elements to `bufSize`.
- `ctb.BufSize()` will return a `size_t` with the current setting.
- The default setting is `1`.
* `ctb.BufValue(bufValue)` will set the element used for margins to `bufValue`.
- `ctb.BufValue()` will return a `double` with the current setting.
- The default setting is `-1.0`.
### Scaling values
`ColumnsToBlocks` also has the capability of linearly scaling values of the
inputs to a given range.
* `ctb.Scale(true)` enables scaling values.
- By default scaling is disabled.
- `ctb.Scale(false)` will disable scaling.
- `ctb.Scale()` will return a `bool` indicating whether scaling is enabled.
* `ctb.MinRange(value)` sets the lower bound of the scaling range to `value`.
- `ctb.MinRange()` returns the current value as a `double`.
* `ctb.MaxRange(value)` sets the upper bound of the scaling range to `value`.
- `ctb.MaxRange()` returns the current value as a `double`.
- Must be greater than `ctb.MinRange()`, if `ctb.Scale() == true`.
***Note:*** the margin element (`ctb.BufValue()`) is considered during the
scaling process.
### Transforming into block format
* `ctb.Transform(input, output)` will perform the columns-to-blocks
transformation on the given matrix `input`, storing the result in the matrix
`output`.
- An exception will be thrown if `input.n_rows` is not equal to
`ctb.BlockHeight() * ctb.BlockWidth()` (if neither of those is `0`).
- If either `ctb.BlockHeight()` or `ctb.BlockWidth()` is `0`, each column
will be reshaped into a square, and an exception will be thrown if
`input.n_rows` is not a perfect square (i.e. if `sqrt(input.n_rows)` is not
an integer).
### Examples
Reshape two 4-element vectors into one row of two blocks.
```c++
// This matrix has two columns.
arma::mat input;
input = { { -1.0000, 0.1429 },
{ -0.7143, 0.4286 },
{ -0.4286, 0.7143 },
{ -0.1429, 1.0000 } };
input.print("Input columns:");
arma::mat output;
mlpack::ColumnsToBlocks ctb(1, 2);
ctb.Transform(input, output);
// The columns of the input will be reshaped as a square which is
// surrounded by padding value -1 (this value could be changed with the
// BufValue() method):
// -1.0000 -1.0000 -1.0000 -1.0000 -1.0000 -1.0000 -1.0000
// -1.0000 -1.0000 -0.4286 -1.0000 0.1429 0.7143 -1.0000
// -1.0000 -0.7143 -0.1429 -1.0000 0.4286 1.0000 -1.0000
// -1.0000 -1.0000 -1.0000 -1.0000 -1.0000 -1.0000 -1.0000
output.print("Output using 2x2 block size:");
// Now, let's change some parameters; let's have each input column output not
// as a square, but as a 4x1 vector.
ctb.BlockWidth(1);
ctb.BlockHeight(4);
ctb.Transform(input, output);
// The output here will be similar, but each block is 4x1:
// -1.0000 -1.0000 -1.0000 -1.0000 -1.0000
// -1.0000 -1.0000 -1.0000 0.1429 -1.0000
// -1.0000 -0.7143 -1.0000 0.4286 -1.0000
// -1.0000 -0.4286 -1.0000 0.7143 -1.0000
// -1.0000 -0.1429 -1.0000 1.0000 -1.0000
// -1.0000 -1.0000 -1.0000 -1.0000 -1.0000
output.print("Output using 4x1 block size:");
```
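The output sizes in the example above (4x7 with 2x2 blocks, 6x5 with 4x1 blocks) are consistent with one margin of `ctb.BufSize()` elements before the first block and one after every block. The helper below is a sketch inferred from those outputs, not taken from the implementation; `BlockOutputExtent` is a hypothetical name.

```c++
#include <cstddef>

// Hypothetical helper (not part of mlpack): number of output rows or columns
// along one axis, given the number of blocks along that axis, the extent of
// each block, and the margin size.
std::size_t BlockOutputExtent(const std::size_t numBlocks,
                              const std::size_t blockExtent,
                              const std::size_t bufSize = 1)
{
  // One margin before the first block, plus (block + margin) per block.
  return bufSize + numBlocks * (blockExtent + bufSize);
}
```

With one row of two 2x2 blocks, this gives `BlockOutputExtent(1, 2)` = 4 rows and `BlockOutputExtent(2, 2)` = 7 columns, matching the first output printed above.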
---
Load simple images and reshape into blocks.
```c++
// Load some favicons from websites associated with mlpack.
std::vector<std::string> images;
// See the following files:
// - https://datasets.mlpack.org/images/mlpack-favicon.png
// - https://datasets.mlpack.org/images/ensmallen-favicon.png
// - https://datasets.mlpack.org/images/armadillo-favicon.png
// - https://datasets.mlpack.org/images/bandicoot-favicon.png
images.push_back("mlpack-favicon.png");
images.push_back("ensmallen-favicon.png");
images.push_back("armadillo-favicon.png");
images.push_back("bandicoot-favicon.png");
mlpack::data::ImageInfo info;
info.Channels() = 1; // Force loading in grayscale.
arma::mat matrix;
mlpack::data::Load(images, matrix, info, true);
// Now `matrix` has 4 columns, each of which is an individual image.
// Let's save that as its own image just for visualization.
mlpack::data::ImageInfo outInfo(matrix.n_cols, matrix.n_rows, 1);
mlpack::data::Save("favicons-matrix.png", matrix, outInfo, true);
// Use ColumnsToBlocks to create a 2x2 block matrix holding each image.
mlpack::ColumnsToBlocks ctb(2, 2);
ctb.BufValue(0.0); // Use 0 for the margin value.
ctb.BufSize(2); // Use 2-pixel margins.
arma::mat blocks;
ctb.Transform(matrix, blocks);
mlpack::data::ImageInfo blockOutInfo(blocks.n_cols, blocks.n_rows, 1);
mlpack::data::Save("favicons-blocks.png", blocks, blockOutInfo, true);
```
The resulting images (before and after using `ColumnsToBlocks`) are shown below.
*Before*:
<center>
<img src="../../img/favicons-matrix.png" alt="four favicons each as a column in a matrix, unintelligible">
</center>
*After*:
<center>
<img src="../../img/favicons-blocks.png" alt="four favicons each as a block in a larger image, much better">
</center>
### See also
* [Loading and saving image data](../load_save.md#image-data)
* [`SparseAutoencoder`](/src/mlpack/methods/sparse_autoencoder/sparse_autoencoder.hpp)
## Distribution utilities
* `Digamma(x)` returns the logarithmic derivative of the gamma function (see
[Wikipedia](https://en.wikipedia.org/wiki/Digamma_function)).
- `x` should have type `double`.
- The return type is `double`.
* `Trigamma(x)` returns the
[trigamma function](https://en.wikipedia.org/wiki/Trigamma_function) at the
value `x`.
- `x` should have type `double`.
- The return type is `double`.
* Both of these functions are used internally by the
[`GammaDistribution`](distributions.md#gammadistribution) class.
*Example*:
```c++
const double d1 = mlpack::Digamma(0.25);
const double d2 = mlpack::Digamma(1.0);
const double t1 = mlpack::Trigamma(0.25);
const double t2 = mlpack::Trigamma(1.0);
std::cout << "Digamma(0.25): " << d1 << "." << std::endl;
std::cout << "Digamma(1.0): " << d2 << "." << std::endl;
std::cout << "Trigamma(0.25): " << t1 << "." << std::endl;
std::cout << "Trigamma(1.0): " << t2 << "." << std::endl;
```
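One way to sanity-check these values is to differentiate the log-gamma function numerically, since the digamma and trigamma functions are its first and second derivatives. The sketch below uses central finite differences on `std::lgamma`; it is only a rough check for `x` comfortably greater than `h`, not how mlpack computes these functions, and the function names are hypothetical.

```c++
#include <cmath>

// Illustrative numerical check (not how mlpack computes the function):
// first derivative of log-gamma via a central finite difference.
double DigammaCheck(const double x, const double h = 1e-5)
{
  return (std::lgamma(x + h) - std::lgamma(x - h)) / (2.0 * h);
}

// Second derivative of log-gamma via a central second difference.
double TrigammaCheck(const double x, const double h = 1e-4)
{
  return (std::lgamma(x + h) - 2.0 * std::lgamma(x) + std::lgamma(x - h)) /
      (h * h);
}
```

At `x = 1`, the digamma function equals the negative of the Euler-Mascheroni constant (about `-0.5772`) and the trigamma function equals `pi^2 / 6` (about `1.6449`), so both checks should land close to those values.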
## `RandVector()`
* `RandVector(v)` generates a random vector on the unit sphere (i.e. with an
L2-norm of 1) and stores it in the vector `v`.
* `v` should be a dense floating-point Armadillo vector (e.g. `arma::vec` or
`arma::fvec`).
* The [Box-Muller transform](https://en.wikipedia.org/wiki/Box-Muller_transform)
is used to generate the vector.
* `v` is not resized, and should have size equal to the desired dimensionality
when `RandVector()` is called.
*Example*:
```c++
// Generate a random 10-dimensional vector.
arma::vec v;
v.set_size(10);
mlpack::RandVector(v);
std::cout << "Random 10-dimensional vector: " << std::endl;
std::cout << v.t();
std::cout << "L2-norm of vector (should be 1): " << arma::norm(v, 2) << "."
<< std::endl;
```
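For readers unfamiliar with the technique, the following standalone sketch shows the general idea: the Box-Muller transform turns pairs of uniform samples into pairs of standard normal samples, and normalizing a vector of such samples gives a point on the unit sphere. This is an illustration using only the standard library, not mlpack's implementation; `RandomUnitVectorSketch` is a hypothetical name.

```c++
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Illustrative sketch (not mlpack's implementation): draw standard normal
// values with the Box-Muller transform, then scale the vector to unit length.
std::vector<double> RandomUnitVectorSketch(const std::size_t dim,
                                           std::mt19937& rng)
{
  const double pi = std::acos(-1.0);
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  std::vector<double> v(dim);

  for (std::size_t i = 0; i < dim; i += 2)
  {
    // 1 - u lies in (0, 1], so the logarithm is always finite.
    const double radius = std::sqrt(-2.0 * std::log(1.0 - uniform(rng)));
    const double angle = 2.0 * pi * uniform(rng);
    v[i] = radius * std::cos(angle);
    if (i + 1 < dim)
      v[i + 1] = radius * std::sin(angle);
  }

  // Normalize to unit L2-norm.
  double norm = 0.0;
  for (const double x : v)
    norm += x * x;
  norm = std::sqrt(norm);
  for (double& x : v)
    x /= norm;

  return v;
}
```

Each Box-Muller step yields two independent normal values, which is why the loop advances two dimensions at a time.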
## Logarithmic utilities
mlpack contains a few functions that are useful for working with logarithms, or
vectors containing logarithms.
* `LogAdd(x, y)` for scalars `x` and `y` (e.g. `double`, `float`, `int`, etc.)
will return `log(e^x + e^y)`.
* `AccuLog(v)`, given a vector `v` containing log values, will return the
scalar log-sum of those values:
`log(e^(v[0]) + e^(v[1]) + ... + e^(v[v.n_elem - 1]))`.
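The reason to use dedicated functions rather than calling `exp()` and `log()` directly is numerical: `e^x` overflows for large `x`. A common remedy, sketched below with only the standard library, is to factor out the larger argument. This is an illustration of the idea, not mlpack's implementation, and the `...Sketch` names are hypothetical.

```c++
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Illustrative sketch (not mlpack's implementation): log(e^x + e^y), computed
// by factoring out the larger argument so that exp() cannot overflow.
double LogAddSketch(const double x, const double y)
{
  const double hi = std::max(x, y);
  const double lo = std::min(x, y);
  if (std::isinf(hi) && hi < 0)
    return hi; // Both terms are log(0).
  return hi + std::log1p(std::exp(lo - hi));
}

// Log-sum over a vector of log values, built on the same idea.
double AccuLogSketch(const std::vector<double>& v)
{
  double result = -std::numeric_limits<double>::infinity(); // log(0)
  for (const double x : v)
    result = LogAddSketch(result, x);
  return result;
}
```

With this formulation, `LogAddSketch(1000.0, 1000.0)` evaluates to `1000 + log(2)`, whereas the direct expression `log(e^1000 + e^1000)` overflows to infinity in double precision.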
---
* `LogSumExp(m, out)`, given a matrix `m` (`arma::mat`) containing log values,
will compute the scalar log-sum of each *column*, storing the result in the
column vector `out` (type `arma::vec`).
- `out` will be set to size `m.n_cols`.
- `out[i]` will be equal to `AccuLog(m.col(i))`.
- Different element types can be used for `m` and `out` (e.g. `arma::fmat`
and `arma::fvec`).
* `LogSumExpT(m, out)`, given a matrix `m` (type `arma::mat`) containing log
values, will compute the scalar log-sum of each *row*, storing the result in
the column vector `out` (type `arma::vec`).
- `out` will be set to size `m.n_rows`.
- `out[i]` will be equal to `AccuLog(m.row(i))`.
- Different element types can be used for `m` and `out` (e.g. `arma::fmat`
and `arma::fvec`).
---
* `LogSumExp<eT, true>(m, out)` performs an incremental sum, otherwise
identical to `LogSumExp()`.
- The input values of `out` are not ignored.
- `out[i]` will be equal to `log(e^(out[i]) + e^(AccuLog(m.col(i))))`.
- `eT` represents the element type of `m` and `out` (e.g., `double` if `m` is
`arma::mat` and `out` is `arma::vec`).
* `LogSumExpT<eT, true>(m, out)` performs an incremental sum, otherwise
identical to `LogSumExpT()`.
- The input values of `out` are not ignored.
- `out[i]` will be equal to `log(e^(out[i]) + e^(AccuLog(m.row(i))))`.
- `eT` represents the element type of `m` and `out` (e.g., `double` if `m` is
`arma::mat` and `out` is `arma::vec`).
## `MultiplyCube2Cube()`
* `z = MultiplyCube2Cube(x, y, transX=false, transY=false)`
- Inputs `x` and `y` are cubes (e.g. `arma::cube`), and must have the same
number of slices.
- `z` is a cube whose slices are the slices of `x` and `y` multiplied.
- `transX` and `transY` indicate whether each slice of `x` and `y` should be
transposed before multiplication.
* If `transX` and `transY` are `false`, then
`z.slice(i) = x.slice(i) * y.slice(i)`.
* If `transX` is `false` and `transY` is `true`, then
`z.slice(i) = x.slice(i) * y.slice(i).t()`.
* The inner dimensions of `x` and `y` must match for multiplication, or an
exception will be thrown.
*Example usage:*
```c++
// Generate two random cubes.
arma::cube x(10, 100, 5, arma::fill::randu); // 5 matrices, each 10x100.
arma::cube y(12, 100, 5, arma::fill::randu); // 5 matrices, each 12x100.
arma::cube z = mlpack::MultiplyCube2Cube(x, y, false, true);
// Output size should be 10x12x5.
std::cout << "Output size: " << z.n_rows << "x" << z.n_cols << "x" << z.n_slices
<< "." << std::endl;
```
## `MultiplyMat2Cube()`
* `z = MultiplyMat2Cube(x, y, transX=false, transY=false)`
- Input `x` is a matrix and `y` is a cube (e.g. `arma::cube`).
- `z` is a cube whose slices are `x` multiplied by the slices of `y`.
- `transX` and `transY` indicate whether `x` and each slice of `y` should be
transposed before multiplication.
* If `transX` and `transY` are `false`, then `z.slice(i) = x * y.slice(i)`.
* If `transX` is `false` and `transY` is `true`, then
`z.slice(i) = x * y.slice(i).t()`.
* The inner dimensions of `x` and `y` must match for multiplication, or an
exception will be thrown.
*Example usage:*
```c++
// Generate random inputs.
arma::mat x(10, 100, arma::fill::randu); // Random 10x100 matrix.
arma::cube y(12, 100, 5, arma::fill::randu); // 5 matrices, each 12x100.
arma::cube z = mlpack::MultiplyMat2Cube(x, y, false, true);
// Output size should be 10x12x5.
std::cout << "Output size: " << z.n_rows << "x" << z.n_cols << "x" << z.n_slices
<< "." << std::endl;
```
## `MultiplyCube2Mat()`
* `z = MultiplyCube2Mat(x, y, transX=false, transY=false)`
- Input `x` is a cube (e.g. `arma::cube`) and `y` is a matrix.
- `z` is a cube whose slices are the slices of `x` multiplied with `y`.
- `transX` and `transY` indicate whether each slice of `x` and `y` should be
transposed before multiplication.
* If `transX` and `transY` are `false`, then `z.slice(i) = x.slice(i) * y`.
* If `transX` is `true` and `transY` is `false`, then
`z.slice(i) = x.slice(i).t() * y`.
* The inner dimensions of `x` and `y` must match for multiplication, or an
exception will be thrown.
*Example usage:*
```c++
// Generate random inputs.
arma::cube x(12, 50, 5, arma::fill::randu); // 5 matrices, each 12x50.
arma::mat y(12, 60, arma::fill::randu); // Random 12x60 matrix.
arma::cube z = mlpack::MultiplyCube2Mat(x, y, true, false);
// Output size should be 50x60x5.
std::cout << "Output size: " << z.n_rows << "x" << z.n_cols << "x" << z.n_slices
<< "." << std::endl;
```
## `Quantile()`
* Compute the quantile function of the Gaussian distribution at the given
probability.
* `double q = Quantile(p, mu=0.0, sigma=1.0)`
- `q` is the computed quantile.
- `p` is the probability at which to compute the quantile (between 0 and 1).
- `mu` is the (optional) mean of the Gaussian distribution.
- `sigma` is the (optional) standard deviation of the Gaussian distribution.
- All arguments are `double`s.
* See also [Quantile function on Wikipedia](https://en.wikipedia.org/wiki/Quantile_function).
*Example usage:*
```c++
// 70% of points from N(0, 1) are less than q1 = 0.524.
double q1 = mlpack::Quantile(0.7);
// 90% of points from N(0, 1) are less than q2 = 1.282.
double q2 = mlpack::Quantile(0.9);
// 50% of points from N(1, 1) are less than q3 = 1.0.
double q3 = mlpack::Quantile(0.5, 1.0);
// 10% of points from N(1, 0.1) are less than q4 = 0.872.
double q4 = mlpack::Quantile(0.1, 1.0, 0.1);
std::cout << "Quantile(0.7): " << q1 << "." << std::endl;
std::cout << "Quantile(0.9): " << q2 << "." << std::endl;
std::cout << "Quantile(0.5, 1.0): " << q3 << "." << std::endl;
std::cout << "Quantile(0.1, 1.0, 0.1): " << q4 << "." << std::endl;
```
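The quantile function is the inverse of the cumulative distribution function (CDF), so the values above can be cross-checked by inverting the Gaussian CDF numerically. The sketch below does this by bisection using `std::erfc`; it is a slow but simple check, not mlpack's implementation, and the function names are hypothetical.

```c++
#include <cmath>

// Gaussian CDF with mean `mu` and standard deviation `sigma`, written with
// std::erfc.
double GaussianCdf(const double x, const double mu = 0.0,
                   const double sigma = 1.0)
{
  return 0.5 * std::erfc(-(x - mu) / (sigma * std::sqrt(2.0)));
}

// Illustrative sketch (not mlpack's implementation): invert the CDF by
// bisection for 0 < p < 1.
double QuantileSketch(const double p, const double mu = 0.0,
                      const double sigma = 1.0)
{
  // Bracket the answer generously, then halve the bracket repeatedly.
  double lo = mu - 40.0 * sigma;
  double hi = mu + 40.0 * sigma;
  for (int i = 0; i < 200; ++i)
  {
    const double mid = 0.5 * (lo + hi);
    if (GaussianCdf(mid, mu, sigma) < p)
      lo = mid;
    else
      hi = mid;
  }
  return 0.5 * (lo + hi);
}
```

For example, `QuantileSketch(0.7)` is approximately `0.524` and `QuantileSketch(0.9)` is approximately `1.282`.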
## RNG and random number utilities
On top of the random number generation support that Armadillo provides via
[randu()](https://arma.sourceforge.net/docs.html#randu),
[randn()](https://arma.sourceforge.net/docs.html#randn), and
[randi()](https://arma.sourceforge.net/docs.html#randi), mlpack provides
a few additional thread-safe random number generation functions for generating
random scalar values.
* `RandomSeed(seed)` will set the random seed of mlpack's RNGs ***and***
Armadillo's RNG to `seed`.
- This internally calls `arma::arma_rng::set_seed()`.
- In a multithreaded application, each thread's RNG will be deterministically
set to a different value based on `seed`.
* `Random()` returns a random `double` uniformly distributed between `0` and
`1`, *not including 1*.
* `Random(lo, hi)` returns a random `double` uniformly distributed between `lo`
and `hi`, *not including `hi`*.
* `RandBernoulli(p)` samples from a Bernoulli distribution with parameter `p`:
with probability `p`, `1` is returned; with probability `1 - p`, `0` is
returned.
* `RandInt(hiExclusive)` returns a random `int` uniformly distributed in the
range `[0, hiExclusive)`.
* `RandInt(lo, hiExclusive)` returns a random `int` uniformly distributed in
the range `[lo, hiExclusive)`.
* `RandNormal()` returns a random `double` normally distributed with mean `0`
and standard deviation `1`.
* `RandNormal(mean, stddev)` returns a random `double` normally distributed
with mean `mean` and standard deviation `stddev`.
*Examples*:
```c++
mlpack::RandomSeed(123); // Set a specific random seed.
const double r1 = mlpack::Random(); // In the range [0, 1).
const double r2 = mlpack::Random(3, 4); // In the range [3, 4).
const double r3 = mlpack::RandBernoulli(0.25); // P(1) = 0.25.
const int r4 = mlpack::RandInt(10); // In the range [0, 10).
const int r5 = mlpack::RandInt(5, 10); // In the range [5, 10).
const double r6 = mlpack::RandNormal(); // r6 ~ N(0, 1).
const double r7 = mlpack::RandNormal(2.0, 3.0); // Mean 2, stddev 3.
std::cout << "Random(): " << r1 << "." << std::endl;
std::cout << "Random(3, 4): " << r2 << "." << std::endl;
std::cout << "RandBernoulli(0.25): " << r3 << "." << std::endl;
std::cout << "RandInt(10): " << r4 << "." << std::endl;
std::cout << "RandInt(5, 10): " << r5 << "." << std::endl;
std::cout << "RandNormal(): " << r6 << "." << std::endl;
std::cout << "RandNormal(2, 3): " << r7 << "." << std::endl;
```
## `RandomBasis()`
The `RandomBasis()` function generates a random d-dimensional orthogonal basis.
* `RandomBasis(basis, d)` fills the matrix `basis` with `d` orthogonal vectors,
each of dimension `d`.
- `basis.col(i)` represents the `i`th basis vector.
- `basis` will have size `d` rows by `d` cols.
* The random basis is generated using the QR decomposition.
*Example*:
```c++
arma::mat basis;
// Generate a 10-dimensional random basis.
mlpack::RandomBasis(basis, 10);
// Any two distinct basis vectors are orthogonal.
std::cout << "Dot product of basis vectors 2 and 4: "
<< arma::dot(basis.col(2), basis.col(4))
<< " (should be zero or very close!)." << std::endl;
```
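To illustrate why a QR decomposition produces a random basis, the standalone sketch below orthonormalizes random Gaussian vectors with modified Gram-Schmidt, which computes the Q factor of a QR decomposition. It substitutes Gram-Schmidt for a library QR routine and is not mlpack's implementation; `RandomBasisSketch` is a hypothetical name.

```c++
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Illustrative sketch (not mlpack's implementation): orthonormalize random
// Gaussian vectors with modified Gram-Schmidt, which yields the Q factor of a
// QR decomposition. basis[i] is the i-th basis vector.
std::vector<std::vector<double>> RandomBasisSketch(const std::size_t d,
                                                   std::mt19937& rng)
{
  std::normal_distribution<double> normal(0.0, 1.0);
  std::vector<std::vector<double>> basis(d, std::vector<double>(d));

  for (std::size_t i = 0; i < d; ++i)
  {
    for (double& x : basis[i])
      x = normal(rng);

    // Remove the components along the vectors already in the basis.
    for (std::size_t j = 0; j < i; ++j)
    {
      double dot = 0.0;
      for (std::size_t k = 0; k < d; ++k)
        dot += basis[i][k] * basis[j][k];
      for (std::size_t k = 0; k < d; ++k)
        basis[i][k] -= dot * basis[j][k];
    }

    // Normalize to unit length.
    double norm = 0.0;
    for (const double x : basis[i])
      norm += x * x;
    norm = std::sqrt(norm);
    for (double& x : basis[i])
      x /= norm;
  }

  return basis;
}
```

Gram-Schmidt is chosen here only because it is short to write; for large or ill-conditioned inputs, a Householder-based QR routine is generally more numerically robust.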
## `ShuffleData()`
Shuffle a [column-major](../matrices.md#representing-data-in-mlpack) dataset and
associated labels/responses, optionally with weights. This preserves the
connection of each data point to its label (and optionally its weight).
* `ShuffleData(inputData, inputLabels, outputData, outputLabels)`
- Randomly permute data points and labels from `inputData` and `inputLabels`
into `outputData` and `outputLabels`.
- `outputData` will be set to the same size as `inputData`.
- `outputLabels` will be set to the same size as `inputLabels`.
- `inputData` can be a dense matrix, a sparse matrix, or a cube, with any
element type. (That is, `inputData` may have type `arma::mat`,
`arma::fmat`, `arma::sp_mat`, `arma::cube`, etc.)
- `inputLabels` must be a dense vector type but may hold any element type
(e.g. `arma::Row<size_t>`, `arma::uvec`, `arma::vec`, etc.).
- `outputData` must have the same type as `inputData`, and `outputLabels`
must have the same type as `inputLabels`.
* `ShuffleData(inputData, inputLabels, inputWeights, outputData, outputLabels, outputWeights)`
- Identical to the previous overload, but also handles weights via
`inputWeights` and `outputWeights`.
- `inputWeights` must be a dense vector type but may hold any element type
(e.g. `arma::rowvec`, `arma::frowvec`, `arma::vec`, etc.).
- `outputWeights` must have the same type as `inputWeights`.
***Note:*** when `inputData` is a cube (e.g. `arma::cube` or similar), the
columns of the cube will be shuffled.
*Example usage:*
```c++
// See https://datasets.mlpack.org/iris.csv.
arma::mat dataset;
mlpack::data::Load("iris.csv", dataset, true);
// See https://datasets.mlpack.org/iris.labels.csv.
arma::Row<size_t> labels;
mlpack::data::Load("iris.labels.csv", labels, true);
// Now shuffle the points in the iris dataset.
arma::mat shuffledDataset;
arma::Row<size_t> shuffledLabels;
mlpack::ShuffleData(dataset, labels, shuffledDataset, shuffledLabels);
std::cout << "Before shuffling, the first point was: " << std::endl;
std::cout << " " << dataset.col(0).t();
std::cout << "with label " << labels[0] << "." << std::endl;
std::cout << std::endl;
std::cout << "After shuffling, the first point is: " << std::endl;
std::cout << " " << shuffledDataset.col(0).t();
std::cout << "with label " << shuffledLabels[0] << "." << std::endl;
// Generate random weights, then shuffle those also.
arma::rowvec weights(dataset.n_cols, arma::fill::randu);
arma::rowvec shuffledWeights;
mlpack::ShuffleData(dataset, labels, weights, shuffledDataset, shuffledLabels,
shuffledWeights);
std::cout << std::endl << std::endl;
std::cout << "Before shuffling with weights, the first point was: "
<< std::endl;
std::cout << " " << dataset.col(0).t();
std::cout << "with label " << labels[0] << " and weight " << weights[0] << "."
<< std::endl;
std::cout << std::endl;
std::cout << "After shuffling with weights, the first point is: " << std::endl;
std::cout << " " << shuffledDataset.col(0).t();
std::cout << "with label " << shuffledLabels[0] << " and weight "
<< shuffledWeights[0] << "." << std::endl;
```
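The key property of this kind of shuffle is that one permutation is applied to the data and the labels alike. The standalone sketch below shows that idea with standard containers; it is not mlpack's implementation, `ShuffleSketch` is a hypothetical name, and it assumes the output containers are distinct objects from the inputs.

```c++
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Illustrative sketch (not mlpack's implementation): apply one random
// permutation to both points and labels so that each pair stays together.
void ShuffleSketch(const std::vector<std::vector<double>>& inputPoints,
                   const std::vector<std::size_t>& inputLabels,
                   std::vector<std::vector<double>>& outputPoints,
                   std::vector<std::size_t>& outputLabels,
                   std::mt19937& rng)
{
  // order[i] is the index of the input point that lands in output slot i.
  std::vector<std::size_t> order(inputPoints.size());
  std::iota(order.begin(), order.end(), 0);
  std::shuffle(order.begin(), order.end(), rng);

  outputPoints.resize(inputPoints.size());
  outputLabels.resize(inputLabels.size());
  for (std::size_t i = 0; i < order.size(); ++i)
  {
    outputPoints[i] = inputPoints[order[i]];
    outputLabels[i] = inputLabels[order[i]];
  }
}
```

Weights could be handled the same way, by indexing a weight vector with the same `order`.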