% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/descriptives.R
\name{coef_var}
\alias{coef_var}
\alias{distribution_cv}
\alias{distribution_coef_var}
\alias{coef_var.numeric}
\title{Compute the coefficient of variation}
\usage{
coef_var(x, ...)

distribution_coef_var(x, ...)

\method{coef_var}{numeric}(
  x,
  mu = NULL,
  sigma = NULL,
  method = c("standard", "unbiased", "median_mad", "qcd"),
  trim = 0,
  remove_na = FALSE,
  n = NULL,
  ...
)
}
\arguments{
\item{x}{A numeric vector of ratio scale (see details), or vector of values that can be coerced to one.}

\item{...}{Further arguments passed to computation functions.}

\item{mu}{A numeric vector of mean values to use to compute the coefficient
of variation. If supplied, \code{x} is not used to compute the mean.}

\item{sigma}{A numeric vector of standard deviation values to use to compute the coefficient
of variation. If supplied, \code{x} is not used to compute the SD.}

\item{method}{Method to use to compute the CV. Can be \code{"standard"} to compute
by dividing the standard deviation by the mean, \code{"unbiased"} for the
unbiased estimator for normally distributed data, or one of two robust
alternatives: \code{"median_mad"} to divide the \code{\link[stats:mad]{stats::mad()}} by the median,
or \code{"qcd"} (quartile coefficient of dispersion, interquartile range divided
by the sum of the quartiles [twice the midhinge]: \eqn{(Q_3 - Q_1)/(Q_3 + Q_1)}).}

\item{trim}{The fraction (0 to 0.5) of values to be trimmed from
each end of \code{x} before the mean and standard deviation (or other measures)
are computed. Values of \code{trim} outside the range 0 to 0.5 are taken
as the nearest endpoint.}

\item{remove_na}{Logical. Should \code{NA} values be removed before computing (\code{TRUE})
or not (\code{FALSE}, default)?}

\item{n}{Sample size used to adjust the computed CV for small-sample bias.
Only needed if \code{method = "unbiased"} and both \code{mu} and \code{sigma}
are provided (not computed from \code{x}).}
}
\value{
The computed coefficient of variation for \code{x}.
}
\description{
Compute the coefficient of variation (CV, ratio of the standard deviation to
the mean, \eqn{\sigma/\mu}) for a set of numeric values.
}
\details{
CV is only applicable to values measured on a ratio scale: values that have a
\emph{fixed}, meaningfully defined 0 (which is either the lowest or highest
possible value), and for which ratios between them are interpretable. For example,
how many sandwiches have I eaten this week? 0 means "none" and 20 sandwiches
is 4 times more than 5 sandwiches. If I were to center the number of
sandwiches, it would no longer be on a ratio scale (0 is no longer "none", it is the
mean, and the ratio between 4 and -2 is not meaningful). Scaling a ratio
scale still results in a ratio scale. So I can redefine the measure as "how many half
sandwiches did I eat this week?" ( = sandwiches * 2) and 0 would still mean
"none", and 20 half-sandwiches is still 4 times more than 5 half-sandwiches.

This means that CV is \strong{NOT} invariant to shifting, but it is to scaling:

\if{html}{\out{<div class="sourceCode r">}}\preformatted{sandwiches <- c(0, 4, 15, 0, 0, 5, 2, 7)
coef_var(sandwiches)
#> [1] 1.239094

coef_var(sandwiches / 2) # same
#> [1] 1.239094

coef_var(sandwiches + 4) # different! 0 is no longer meaningful!
#> [1] 0.6290784
}\if{html}{\out{</div>}}
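
As a rough illustration of what the non-default choices of \code{method} estimate,
the ratios can be sketched by hand in base R. This is a minimal sketch under
stated assumptions, not the package's internal code: the names
\code{cv_standard}, \code{cv_median_mad}, and \code{cv_qcd} are hypothetical
helpers introduced here, the robust variant is assumed to be the MAD divided by
the median (the robust analogue of \eqn{\sigma/\mu}), and the quartiles are
assumed to come from the default settings of \code{stats::quantile()}.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{x <- c(1:10, 100)

# standard: standard deviation divided by the mean
cv_standard <- sd(x) / mean(x)

# median_mad (assumed form): MAD divided by the median
cv_median_mad <- mad(x) / median(x)

# qcd: (Q3 - Q1) / (Q3 + Q1), using the default quantile type
q <- unname(quantile(x, probs = c(0.25, 0.75)))
cv_qcd <- (q[2] - q[1]) / (q[2] + q[1])
}\if{html}{\out{</div>}}

With the single outlier (100) in \code{x}, the two robust ratios should be much
less affected than the standard one, which is the motivation for offering them.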
}
\examples{
coef_var(1:10)
coef_var(c(1:10, 100), method = "median_mad")
coef_var(c(1:10, 100), method = "qcd")
coef_var(mu = 10, sigma = 20)
coef_var(mu = 10, sigma = 20, method = "unbiased", n = 30)
}