---
title: "Effect Sizes for Contingency Tables"
output: 
  rmarkdown::html_vignette:
    toc: true
    fig_width: 10.08
    fig_height: 6
tags: [r, effect size, rules of thumb, guidelines, conversion]
vignette: >
  \usepackage[utf8]{inputenc}
  %\VignetteIndexEntry{Effect Sizes for Contingency Tables}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  chunk_output_type: console
bibliography: bibliography.bib
---
```{r setup, include=FALSE}
library(knitr)
options(knitr.kable.NA = "")
knitr::opts_chunk$set(comment = ">")
options(digits = 3)
set.seed(7)
.eval_if_requireNamespace <- function(...) {
  pkgs <- c(...)
  knitr::opts_chunk$get("eval") &&
    all(vapply(pkgs, requireNamespace, TRUE, quietly = TRUE))
}
```
This vignette provides a review of effect sizes for 1- and 2-D contingency
tables, which are typically analysed with `chisq.test()` and `fisher.test()`.
```{r}
library(effectsize)
options(es.use_symbols = TRUE) # get nice symbols when printing! (On Windows, requires R >= 4.2.0)
```
# Nominal Correlation
## 2-by-2 Tables
For 2-by-2 contingency tables, $\phi$ (Phi) equals the absolute value of the Pearson correlation between the two dichotomous variables (so it is directionless), with 0 representing no association and 1 representing a perfect association.
```{r}
(MPG_Gear <- table(mtcars$mpg < 20, mtcars$vs))
phi(MPG_Gear, adjust = FALSE)
# Same magnitude (phi is directionless) as:
cor(mtcars$mpg < 20, mtcars$vs)
```
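As a rough sketch of where this value comes from, the unadjusted $\phi$ can be reproduced in base R as $\sqrt{\chi^2 / n}$, using the $\chi^2$ statistic without continuity correction. The object names below (`tab_phi`, `phi_by_hand`) are illustrative and not part of `effectsize`.

```{r}
# Illustrative sketch in base R: phi as sqrt(chi-squared / n)
tab_phi <- table(mtcars$mpg < 20, mtcars$vs)

# Uncorrected chi-squared statistic (no Yates continuity correction)
chi2_phi <- unname(chisq.test(tab_phi, correct = FALSE)$statistic)
n_phi <- sum(tab_phi)

phi_by_hand <- sqrt(chi2_phi / n_phi)
phi_by_hand

# Matches the absolute Pearson correlation of the two indicators
abs(cor(mtcars$mpg < 20, mtcars$vs))
```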
A "cousin" effect size is Pearson's contingency coefficient; however, it is not a true measure of correlation, but rather a type of normalized $\chi^2$ (see `chisq_to_pearsons_c()`):
```{r}
pearsons_c(MPG_Gear)
```
## Larger Tables
For larger contingency tables, Cramér's *V*, Tschuprow's *T*, Cohen's *w*, and Pearson's *C* can be used.
Both Cramér's *V* and Tschuprow's *T* are true measures of correlation: they range from 0 to 1, with 0 indicating complete independence between the nominal variables and 1 indicating complete dependence. For square tables they are equal, but for non-square tables $T < V$. Cramér's *V* defines complete dependence as "it is possible to know exactly the value of the columns from the rows *or* know exactly the value of the rows from the columns", whereas Tschuprow's *T* requires both to be true to achieve complete dependence, which is not possible for non-square tables. For example:
```{r}
data("food_class")
food_class
```
In this case, if you know the food product, you know if it is vegan or not, but
knowing if the food is vegan or not will not always let you know what food
product it is.
```{r}
cramers_v(food_class, adjust = FALSE)
tschuprows_t(food_class, adjust = FALSE)
```
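The difference between the two coefficients comes down to the normalizing term. A minimal base R sketch of the unadjusted formulas follows, using a hypothetical 2-by-3 table (`toy_food`, made up for illustration and not the packaged data) in which the column fully determines the row: *V* divides by the smaller of (rows - 1) and (columns - 1), while *T* divides by their geometric mean, so here *V* reaches 1 while *T* stays below 1 (about 0.84).

```{r}
# Hypothetical 2-by-3 table: each product (column) falls in exactly one row,
# so rows are known from columns, but columns are not known from rows.
toy_food <- matrix(
  c(20,  0,  0,
     0, 20, 20),
  nrow = 2, byrow = TRUE,
  dimnames = list(Vegan = c("Yes", "No"), Product = c("A", "B", "C"))
)

chi2_food <- unname(chisq.test(toy_food)$statistic)
n_food <- sum(toy_food)
k_row <- nrow(toy_food) - 1
k_col <- ncol(toy_food) - 1

# Cramér's V: normalize by the smaller dimension
v_by_hand <- sqrt(chi2_food / (n_food * min(k_row, k_col)))
# Tschuprow's T: normalize by the geometric mean of both dimensions
t_by_hand <- sqrt(chi2_food / (n_food * sqrt(k_row * k_col)))

c(V = v_by_hand, T = t_by_hand)
```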
Cohen's *w* and Pearson's *C* are not true measures of correlation, but are two
types of normalized $\chi^2$ values. While Pearson's *C* is capped at 1, Cohen's
*w* can be larger than 1 (for both, 0 indicates no association between the
variables).
```{r}
data("Music_preferences2")
Music_preferences2
chisq.test(Music_preferences2)
cramers_v(Music_preferences2)
tschuprows_t(Music_preferences2)
pearsons_c(Music_preferences2)
cohens_w(Music_preferences2) # > 1
```
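To see why Cohen's *w* can exceed 1 while Pearson's *C* cannot, it may help to compute both from the $\chi^2$ statistic directly: $w = \sqrt{\chi^2 / n}$ and $C = \sqrt{\chi^2 / (\chi^2 + n)}$. The sketch below uses a hypothetical perfectly diagonal 3-by-3 table (`toy_diag`, an illustrative name) where the association is as strong as it can be.

```{r}
# Hypothetical 3-by-3 table with all counts on the diagonal (perfect dependence)
toy_diag <- diag(20, 3)

chi2_diag <- unname(chisq.test(toy_diag)$statistic)
n_diag <- sum(toy_diag)

# Cohen's w has no upper cap: here it equals sqrt(2)
w_by_hand <- sqrt(chi2_diag / n_diag)
# Pearson's C is always below 1 because chi-squared also appears in the denominator
c_by_hand <- sqrt(chi2_diag / (chi2_diag + n_diag))

c(w = w_by_hand, C = c_by_hand)
```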
(Cramér's *V*, Tschuprow's *T*, and Cohen's *w* can also be used for 2-by-2 tables, but there they are equivalent to $\phi$.)
## For a Bayesian $\chi^2$-test
A Bayesian estimate of these effect sizes can also be provided based on
`BayesFactor`'s version of a $\chi^2$-test via the `effectsize()` function:
```{r, eval = .eval_if_requireNamespace("BayesFactor"), message=FALSE}
library(BayesFactor)
BFX <- contingencyTableBF(MPG_Gear, sampleType = "jointMulti")
effectsize(BFX, type = "phi") # for 2 * 2
BFX <- contingencyTableBF(Music_preferences2, sampleType = "jointMulti")
effectsize(BFX, type = "cramers_v")
effectsize(BFX, type = "tschuprows_t")
effectsize(BFX, type = "cohens_w")
effectsize(BFX, type = "pearsons_c")
```
# Goodness of Fit
Cohen's *w* and Pearson's *C* are also applicable to tests of goodness-of-fit,
where they indicate the degree of deviation from the hypothetical probabilities,
with 0 reflecting no deviation.
```{r}
O <- c(89, 37, 130, 28, 2) # observed group sizes
E <- c(.40, .20, .20, .15, .05) # expected group proportions
chisq.test(O, p = E)
pearsons_c(O, p = E)
cohens_w(O, p = E)
```
However, these values are not (anti)correlations: they do not properly adjust for the "shape" of the expected multinomial distribution, making them inflated (additionally, Cohen's *w* can be larger than 1, making it harder to interpret).
For these reasons, we recommend the $פ$ (Fei) coefficient, which is a measure of anti-correlation between the observed and the expected distributions, ranging from 0 (the observed distribution matches the expected distribution perfectly) to 1 (the observed distribution is maximally different from the expected one).
```{r}
fei(O, p = E)
# Observed perfectly matches Expected
(O1 <- E * 286)
fei(O1, p = E)
# Observed deviates maximally from Expected:
# All observed values are in the least expected class!
(O2 <- c(rep(0, 4), 286))
fei(O2, p = E)
```
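The two boundary cases can be understood from the formula itself. A minimal sketch, assuming the definition $פ = \sqrt{\chi^2 / (n \times (1 / \min(p_E) - 1))}$, where the denominator is the largest $\chi^2$ attainable for the given expected probabilities; `fei_by_hand()` is a hypothetical helper written for illustration, not a function from `effectsize`.

```{r}
# Hypothetical helper: Fei as chi-squared divided by its largest attainable value
fei_by_hand <- function(obs, p) {
  n <- sum(obs)
  chi2 <- sum((obs - n * p)^2 / (n * p)) # goodness-of-fit chi-squared
  sqrt(chi2 / (n * (1 / min(p) - 1)))    # maximum occurs when all cases fall in the rarest class
}

obs_demo <- c(89, 37, 130, 28, 2)
p_demo <- c(.40, .20, .20, .15, .05)

fei_by_hand(obs_demo, p_demo)
fei_by_hand(p_demo * 286, p_demo)       # observed matches expected: 0
fei_by_hand(c(rep(0, 4), 286), p_demo)  # everything in the rarest class: 1
```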
# Differences in Proportions
For 2-by-2 tables, we can also compute the Odds-ratio (OR), where each column
represents a different *group*. Values larger than 1 indicate that the odds are
higher in the first group (and vice versa).
```{r}
data("RCT_table")
RCT_table
chisq.test(RCT_table) # or fisher.test(RCT_table)
oddsratio(RCT_table)
```
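For intuition, the odds ratio is simply the odds of the outcome in the first column divided by the odds in the second column. The sketch below works this out in base R on a hypothetical table (`toy_rct`, with made-up counts and labels, not the packaged data).

```{r}
# Hypothetical counts; columns are the two groups, rows are the outcome
toy_rct <- matrix(
  c(30, 10,
    20, 40),
  nrow = 2, byrow = TRUE,
  dimnames = list(Outcome = c("Event", "No event"), Group = c("A", "B"))
)

odds_a <- toy_rct["Event", "A"] / toy_rct["No event", "A"] # 30 / 20
odds_b <- toy_rct["Event", "B"] / toy_rct["No event", "B"] # 10 / 40

# Values above 1 mean the odds are higher in the first group (column A)
or_by_hand <- odds_a / odds_b
or_by_hand
```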
We can also compute the Risk-ratio (RR), which is the ratio between the
proportions of the two groups, and the Absolute Risk Reduction (ARR), which is
the *difference* between the proportions of the two groups. Both are measures
that some consider more intuitive than the odds ratio.
```{r}
riskratio(RCT_table)
arr(RCT_table)
```
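Both quantities are functions of the two group proportions. A minimal base R sketch on a hypothetical table (`toy_risk`, made-up counts) is given here; taking the difference as first group minus second group is an assumption of this sketch, so the sign convention used by `arr()` should be checked in its documentation.

```{r}
# Hypothetical counts; columns are the two groups, rows are the outcome
toy_risk <- matrix(
  c(30, 10,
    20, 40),
  nrow = 2, byrow = TRUE,
  dimnames = list(Outcome = c("Event", "No event"), Group = c("A", "B"))
)

# Proportion of events within each group (column): 0.6 and 0.2
p_groups <- toy_risk["Event", ] / colSums(toy_risk)

rr_by_hand <- p_groups[["A"]] / p_groups[["B"]]   # ratio of proportions
diff_by_hand <- p_groups[["A"]] - p_groups[["B"]] # difference (assumed direction: A minus B)

c(RR = rr_by_hand, difference = diff_by_hand)
```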
Cohen's *h* can also be computed; it is the difference between the
*arcsin*-transformed proportions. Negative values indicate a smaller proportion
in the first group (and vice versa).
```{r}
cohens_h(RCT_table)
```
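The transformation can be written out explicitly as $h = 2 \arcsin\sqrt{p_1} - 2 \arcsin\sqrt{p_2}$. The sketch below applies it to two hypothetical proportions (`p_first` and `p_second` are illustrative values, not taken from the data above) and shows that swapping the groups only flips the sign.

```{r}
# Hypothetical proportions for the first and second group
p_first <- 0.6
p_second <- 0.2

# Cohen's h: difference between arcsine-transformed proportions
h_by_hand <- 2 * asin(sqrt(p_first)) - 2 * asin(sqrt(p_second))
h_by_hand

# Swapping the groups gives the same magnitude with the opposite sign
h_swapped <- 2 * asin(sqrt(p_second)) - 2 * asin(sqrt(p_first))
h_swapped
```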
## For a Bayesian $\chi^2$-test
A Bayesian estimate of these effect sizes can also be provided based on
`BayesFactor`'s version of a $\chi^2$-test via the `effectsize()` function:
```{r, eval = .eval_if_requireNamespace("BayesFactor")}
BFX <- contingencyTableBF(RCT_table, sampleType = "jointMulti")
effectsize(BFX, type = "or")
effectsize(BFX, type = "rr")
effectsize(BFX, type = "cohens_h")
```
# Paired Contingency Tables
For dependent (paired) contingency tables, Cohen's *g* represents the symmetry
of the table, ranging between 0 (perfect symmetry) and 0.5 (perfect asymmetry).
For example, these two tests seem to be equally predictive of the disease they are screening:
```{r}
data("screening_test")
phi(screening_test$Diagnosis, screening_test$Test1)
phi(screening_test$Diagnosis, screening_test$Test2)
```
Does this mean they give the same number of positive/negative results?
```{r}
tests <- table(Test1 = screening_test$Test1, Test2 = screening_test$Test2)
tests
mcnemar.test(tests)
cohens_g(tests)
```
Test 1 gives more positive results than test 2!
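
For a 2-by-2 paired table, Cohen's *g* depends only on the discordant (off-diagonal) cells: it is the share of discordant pairs that fall in the larger of the two cells, minus 0.5. A minimal base R sketch on a hypothetical table (`toy_pairs`, made-up counts) follows.

```{r}
# Hypothetical paired results for two tests on the same cases
toy_pairs <- matrix(
  c(40,  5,
    15, 40),
  nrow = 2, byrow = TRUE,
  dimnames = list(Test1 = c("Neg", "Pos"), Test2 = c("Neg", "Pos"))
)

# Discordant cells: cases where the two tests disagree
n12 <- toy_pairs["Neg", "Pos"]
n21 <- toy_pairs["Pos", "Neg"]

# 0 when disagreements split evenly, 0.5 when they all go one way
g_by_hand <- max(n12, n21) / (n12 + n21) - 0.5
g_by_hand
```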
# References