# [Distribution](https://github.com/sciruby/distribution)

[![Build Status](https://travis-ci.org/SciRuby/distribution.svg?branch=master)](https://travis-ci.org/SciRuby/distribution)
[![Code Climate](https://codeclimate.com/github/SciRuby/distribution/badges/gpa.svg)](https://codeclimate.com/github/SciRuby/distribution)

Distribution is a gem with several probability distributions. Pure Ruby is used by default; C (GSL) or Java extensions are used if available. Some facts:

- Fast Ruby 1.9.3+ implementation, with improved methods for calculating factorials and other common functions.
- All methods tested on several ranges. See `spec/`.
- Code for normal, Student's t and chi square is lifted from the [statistics2 gem](https://rubygems.org/gems/statistics2). Originally at [this site](http://blade.nagaokaut.ac.jp/~sinara/ruby/math/statistics2).
- The code for some functions and RNGs was lifted from Julia's [Rmath-julia](https://github.com/JuliaLang/Rmath-julia), a patched version of R's standalone math library.

The following table lists the available distributions and the methods available for each one. If a field is marked with an *x*, that distribution doesn't have that method implemented.

| Distribution     | PDF | CDF | Quantile | RNG | Mean | Mode | Variance | Skewness | Kurtosis | Entropy |
| :--------------- | :-: | :-: | :------: | :-: | :--: | :--: | :------: | :------: | :------: | :-----: |
| Uniform          |     |     |          |     | x    | x    | x        | x        | x        | x       |
| Normal           |     |     |          |     | x    | x    | x        | x        | x        | x       |
| Lognormal        |     |     | x        | x   | x    | x    | x        | x        | x        | x       |
| Bivariate Normal |     |     | x        | x   | x    | x    | x        | x        | x        | x       |
| Exponential      |     |     |          |     | x    | x    | x        | x        | x        | x       |
| Logistic         |     |     |          |     | x    | x    | x        | x        | x        | x       |
| Student's T      |     |     |          | x   | x    | x    | x        | x        | x        | x       |
| Chi Square       |     |     |          |     | x    | x    | x        | x        | x        | x       |
| Fisher-Snedecor  |     |     |          | x   | x    | x    | x        | x        | x        | x       |
| Beta             |     |     |          | x   | x    | x    | x        | x        | x        | x       |
| Gamma            |     |     | x        | x   | x    | x    | x        | x        | x        | x       |
| Weibull          |     |     |          | x   | x    | x    | x        | x        | x        | x       |
| Binomial         |     |     |          | x   | x    | x    | x        | x        | x        | x       |
| Poisson          |     |     |          |     | x    | x    | x        | x        | x        | x       |
| Hypergeometric   |     |     |          | x   | x    | x    | x        | x        | x        | x       |

## Installation

```
$ gem install distribution
```

You can install GSL for better performance:

* For Mac OS X: `brew install gsl`
* For Ubuntu / Debian: `sudo apt-get install libgsl0-dev`

After successfully installing the library:

```bash
$ gem install rb-gsl
```

## Examples

You can find automatically generated documentation on [RubyDoc](http://www.rubydoc.info/github/sciruby/distribution/master).

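```ruby
require 'distribution'

x = 1.96

# Returns the Gaussian PDF at x.
pdf = Distribution::Normal.pdf(x)

# Returns the Gaussian CDF at x.
cdf = Distribution::Normal.cdf(x)

# Returns the inverse CDF (called p_value in this gem): the value whose
# cumulative probability equals the argument.
pv = Distribution::Normal.p_value(cdf)

# Student's t also needs the degrees of freedom; they are assumed here to be
# the second argument. You would normally use the following
prob = Distribution::T.cdf(x, 10)

# to get the cumulative probability of `x`. However, you can also:

include Distribution::Shorthand
tdist_cdf(x, 10)
```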

## API Structure

```ruby
Distribution::<name>.(cdf|pdf|p_value|rng)
```
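To make the relationship between `pdf`, `cdf` and `p_value` concrete, here is a minimal standalone sketch for the standard normal distribution. It does not use the gem and is not the gem's implementation: the `NormalSketch` module is a hypothetical name, the CDF comes from Ruby's `Math.erf`, and the inverse CDF is found by plain bisection.

```ruby
# Standalone sketch (not the gem's code): standard normal pdf, cdf and
# inverse cdf, to show how the three methods relate to each other.
module NormalSketch
  SQRT2   = Math.sqrt(2.0)
  SQRT2PI = Math.sqrt(2.0 * Math::PI)

  # Density of the standard normal at x.
  def self.pdf(x)
    Math.exp(-0.5 * x * x) / SQRT2PI
  end

  # P(X <= x), computed with the error function from Ruby's Math module.
  def self.cdf(x)
    0.5 * (1.0 + Math.erf(x / SQRT2))
  end

  # Inverse of cdf, found by bisection on [-10, 10]:
  # returns x such that cdf(x) is close to pr.
  def self.p_value(pr)
    raise ArgumentError, 'pr must be in (0, 1)' unless pr > 0.0 && pr < 1.0

    lo = -10.0
    hi = 10.0
    100.times do
      mid = (lo + hi) / 2.0
      if cdf(mid) < pr
        lo = mid
      else
        hi = mid
      end
    end
    (lo + hi) / 2.0
  end
end

x = 1.96
pr = NormalSketch.cdf(x)         # cumulative probability at x
back = NormalSketch.p_value(pr)  # recovers x, up to floating-point error
```

In this sketch `p_value` takes a probability and returns a point on the x axis, so `p_value(cdf(x))` round-trips back to `x`.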

For discrete distributions, exact Ruby implementations of pdf, cdf and p_value may be provided, using

```
  Distribution::<name>.exact_(cdf|pdf|p_value)
```
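One way to read "exact" for a discrete distribution is to compute probabilities with integer and rational arithmetic instead of a floating-point approximation. The sketch below does this for the binomial distribution. It is an illustration only: `ExactBinomialSketch`, its method names and the `(k, n, prob)` argument order are assumptions, not the gem's code.

```ruby
# Hypothetical helper names; this is not the gem's implementation.
module ExactBinomialSketch
  # Binomial coefficient C(n, k) using integer arithmetic only.
  def self.choose(n, k)
    return 0 if k < 0 || k > n

    k = [k, n - k].min
    # After step i the accumulator equals C(n - k + i, i), so each
    # integer division is exact.
    (1..k).reduce(1) { |acc, i| acc * (n - k + i) / i }
  end

  # P(X = k) for X ~ Binomial(n, prob), with prob given as a Rational.
  def self.pdf(k, n, prob)
    choose(n, k) * prob**k * (1 - prob)**(n - k)
  end

  # P(X <= k), summing the exact point probabilities.
  def self.cdf(k, n, prob)
    (0..k).sum { |i| pdf(i, n, prob) }
  end
end

half = Rational(1, 2)
ExactBinomialSketch.pdf(2, 4, half) # => (3/8)
ExactBinomialSketch.cdf(2, 4, half) # => (11/16)
```

Passing a `Rational` keeps every intermediate result exact; passing a `Float` would work too, but the result would then carry ordinary floating-point rounding.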

The `Distribution::Shorthand` module provides shorthand methods for calling all of these:

```
  <Distribution shortname>_(cdf|pdf|p|r)
```

For discrete distributions, the shorthands for the exact cdf, pdf and p_value are

```
  <Distribution shortname>_(ecdf|epdf|ep)
```

Shortnames for distributions:

  * Normal: norm
  * Bivariate Normal: bnor
  * T: tdist
  * F: fdist
  * Chi Square: chisq
  * Binomial: bino
  * Hypergeometric: hypg
  * Exponential: expo
  * Poisson: pois
  * Beta: beta
  * Gamma: gamma
  * LogNormal: lognormal
  * Uniform: unif
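The naming scheme can be sketched as a small lookup that builds a shorthand method name from a distribution and a method. The shortnames are copied from the list above. The pairing of `p_value` with the `p`/`ep` suffixes and `rng` with `r` is inferred from the order of the patterns and should be treated as an assumption. `shorthand_name` is a hypothetical helper that only builds strings; it does not check that the gem defines each resulting method.

```ruby
# Shortnames copied from the README list; the suffix pairing is an assumption.
SHORTNAMES = {
  'Normal' => 'norm', 'Bivariate Normal' => 'bnor', 'T' => 'tdist',
  'F' => 'fdist', 'Chi Square' => 'chisq', 'Binomial' => 'bino',
  'Hypergeometric' => 'hypg', 'Exponential' => 'expo', 'Poisson' => 'pois',
  'Beta' => 'beta', 'Gamma' => 'gamma', 'LogNormal' => 'lognormal',
  'Uniform' => 'unif'
}.freeze

SUFFIXES       = { cdf: 'cdf', pdf: 'pdf', p_value: 'p', rng: 'r' }.freeze
EXACT_SUFFIXES = { cdf: 'ecdf', pdf: 'epdf', p_value: 'ep' }.freeze

# Hypothetical helper: builds the shorthand method name as a String.
def shorthand_name(distribution, method, exact: false)
  table = exact ? EXACT_SUFFIXES : SUFFIXES
  "#{SHORTNAMES.fetch(distribution)}_#{table.fetch(method)}"
end

shorthand_name('T', :cdf)                     # => "tdist_cdf"
shorthand_name('Binomial', :pdf, exact: true) # => "bino_epdf"
```

Using `fetch` means an unknown distribution or method raises a `KeyError` instead of silently producing a malformed name.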

## Roadmap

This gem wasn't updated for a long time before I started working on it, so there is a lot of work to do. The first priority is cleaning up the interface and removing cruft wherever possible. After that, I want to implement more distributions and make sure that each one has an RNG.

### Short-term

- Define a minimal interface for continuous and discrete distributions (e.g. mean, variance, mode, skewness, kurtosis, pdf, cdf, quantile, cquantile).
- Implement `Distribution::Uniform` with the default Ruby `Random`.
- Clean up the implementation of normal distribution. Implement the necessary functions.
- The same for Student's t, chi square, Fisher-Snedecor, beta, gamma, lognormal, logistic.
- The same for discrete distributions: binomial, hypergeometric, bernoulli (still missing), etc.
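As a rough illustration of the first two items, a minimal interface for a continuous uniform distribution backed by Ruby's built-in `Random` might look like the sketch below. `UniformSketch` and all of its method names are hypothetical; this is one possible shape for such an interface, not the planned design.

```ruby
# Hypothetical sketch; class and method names are illustrative only.
class UniformSketch
  attr_reader :lower, :upper

  def initialize(lower, upper, seed: Random.new_seed)
    raise ArgumentError, 'lower must be < upper' unless lower < upper

    @lower = lower.to_f
    @upper = upper.to_f
    @random = Random.new(seed)
  end

  def mean
    (lower + upper) / 2.0
  end

  def variance
    (upper - lower)**2 / 12.0
  end

  # Constant density inside [lower, upper], zero outside.
  def pdf(x)
    x.between?(lower, upper) ? 1.0 / (upper - lower) : 0.0
  end

  def cdf(x)
    return 0.0 if x < lower
    return 1.0 if x > upper

    (x - lower) / (upper - lower)
  end

  # Inverse of cdf for a probability pr in [0, 1].
  def quantile(pr)
    raise ArgumentError, 'pr must be in [0, 1]' unless pr.between?(0, 1)

    lower + pr * (upper - lower)
  end

  # Draws one value in [lower, upper) from Ruby's built-in generator.
  def rng
    @random.rand(lower...upper)
  end
end

u = UniformSketch.new(2, 6, seed: 42)
u.mean          # => 4.0
u.quantile(0.25) # => 3.0
sample = u.rng  # some Float in [2.0, 6.0)
```

Taking the seed as a keyword argument keeps draws reproducible in specs while still defaulting to a fresh seed in normal use.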

### Medium-term

- Implement [DSFMT](http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/) for the uniform random generator.
- Cauchy distribution.

### Long-term

- Implement everything in the distributions × functions table above.

## Issues

* On JRuby and Rubinius, `BivariateNormal` returns an incorrect pdf.

For current issues see the [issue tracker pages](https://github.com/sciruby/distribution/issues).

## OMG! I want to help!

Everyone is welcome to help! Please test these distributions with your own use
cases and give a shout on the issue tracker if you find a problem or if something
is strange or hard to use. Documentation pull requests are totally welcome.
More generally, any ideas or suggestions are welcome, even by private e-mail.

If you want to provide a new distribution, run `lib/distribution`:

```
$ distribution --new your_distribution
```

This should create the main distribution file, the directory with the Ruby and GSL engines, and the specs in the `spec/` directory.