File: p_function.R

Package: r-cran-parameters 0.24.2-2
#' @title p-value or consonance function
#' @name p_function
#'
#' @description Compute p-values and compatibility (confidence) intervals for
#' statistical models at different levels. This function is also called a
#' consonance function. It shows which estimates are compatible with the model
#' at various compatibility levels. Use `plot()` to generate plots of the _p_
#' (or _consonance_) function and of compatibility intervals at different
#' levels.
#'
#' @param ci_levels Vector of scalars, indicating the different levels at which
#' compatibility intervals should be printed or plotted. In plots, these levels
#' are highlighted by vertical lines. The line thickness can be increased for
#' one or more of these levels by providing a named vector, where the values to
#' be highlighted are named `"emph"`, e.g.
#' `ci_levels = c(0.25, 0.5, emph = 0.95)`.
#'
#' @inheritParams model_parameters
#' @inheritParams model_parameters.default
#' @inheritParams model_parameters.glmmTMB
#' @inheritParams standard_error
#'
#' @note
#' Currently, `p_function()` computes intervals based on the Wald t- or
#' z-statistic. For certain models (like mixed models), profiled intervals may
#' be more accurate; however, this is not yet supported.
#'
#' @seealso See also [`equivalence_test()`] and [`p_significance()`] for
#' functions related to checking effect existence and significance.
#'
#' @details
#' ## Compatibility intervals and continuous _p_-values for different estimate values
#'
#' `p_function()` only returns the compatibility interval estimates, not the
#' related _p_-values. The reason is that the _p_-value for a given estimate
#' value is simply `1 - CI_level`. The values at the lower and upper limits of
#' an interval are the estimates associated with that _p_-value. For example,
#' if a parameter `x` has a 75% compatibility interval of `(0.81, 1.05)`, then
#' the _p_-value for the estimate value `0.81` is `1 - 0.75`, which is `0.25`.
#' This relationship is easier to understand when looking at the plots (using
#' `plot()`).
#'
#' ## Conditional versus unconditional interpretation of _p_-values and intervals
#'
#' `p_function()`, and in particular its `plot()` method, aims at re-interpreting
#' _p_-values and confidence intervals (better named: _compatibility_ intervals)
#' in _unconditional_ terms. Instead of referring to long-run properties over
#' repeated trials when interpreting interval estimates (so-called "aleatory
#' probability", _Schweder 2018_), and assuming that all underlying assumptions
#' are correct and met, `p_function()` interprets _p_-values in a Fisherian way
#' as "_continuous_ measure of evidence against the very test hypothesis _and_
#' entire model (all assumptions) used to compute it"
#' (*P-Values Are Tough and S-Values Can Help*, lesslikely.com/statistics/s-values;
#' see also _Amrhein and Greenland 2022_).
#'
#' The common definition of p-values can be considered a "conditional"
#' interpretation:
#'
#' _The p-value is the probability of obtaining test results at least as
#' extreme as the result actually observed, under the assumption that the
#' null hypothesis is correct (Wikipedia)._
#'
#' However, this definition or interpretation is inadequate because it only
#' refers to the test hypothesis (often the null hypothesis), which is only
#' one component of the entire model that is being tested. Thus,
#' _Greenland et al. 2022_ suggest an "unconditional" interpretation.
#'
#' This interpretation as a continuous measure of evidence against the test
#' hypothesis and the entire model used to compute it can be seen in the
#' figure below (taken from *P-Values Are Tough and S-Values Can Help*,
#' lesslikely.com/statistics/s-values). The "conditional" interpretation of
#' _p_-values and interval estimates (A) implicitly assumes certain assumptions
#' to be true, thus the interpretation is "conditioned" on these assumptions
#' (i.e. assumptions are taken as given, only the hypothesis is tested). The
#' unconditional interpretation (B), however, questions _all_ these assumptions.
#'
#' A non-significant p-value could occur because the test hypothesis is false,
#' but could also be the result of any of the model assumptions being incorrect.
#'
#' \if{html}{\cr \figure{unconditional_interpretation.png}{options: alt="Conditional versus unconditional interpretations of P-values"} \cr}
#'
#' "Emphasizing unconditional interpretations helps avoid overconfident and
#' misleading inferences in light of uncertainties about the assumptions used
#' to arrive at the statistical results." (_Greenland et al. 2022_).
#'
#' **Note:** The term "conditional" as used by Rafi and Greenland probably has
#' a slightly different meaning than usual. In this context, "conditional"
#' means that all model assumptions are taken as given; it should not be
#' confused with terms like "conditional probability". See also
#' _Greenland et al. 2022_ for a detailed elaboration on this issue.
#'
#' In other words, the term compatibility interval emphasizes "the dependence
#' of the _p_-value on the assumptions as well as on the data, recognizing that
#' _p_<0.05 can arise from assumption violations even if the effect under
#' study is null" (_Gelman/Greenland 2019_).
#'
#' ## Probabilistic interpretation of p-values and compatibility intervals
#'
#' Schweder (2018) and Schweder and Hjort (2016) (among others) argue that
#' confidence curves (as produced by `p_function()`) have a valid probabilistic
#' interpretation. They distinguish between _aleatory probability_, which
#' describes the stochastic element of a distribution _ex ante_, i.e. before
#' the data are obtained (the classical interpretation of confidence intervals
#' following the Neyman-Pearson school of statistics), and an _ex post_,
#' _epistemic_ probability that applies to confidence curves. The shift in
#' terminology from _confidence_ intervals to _compatibility_ intervals may
#' help emphasize this interpretation.
#'
#' In this sense, the probabilistic interpretation of _p_-values and
#' compatibility intervals is "conditional" - on the data _and_ model assumptions
#' (which is in line with the _"unconditional"_ interpretation in the sense of
#' Rafi and Greenland).
#'
#' Ascribing a probabilistic interpretation to one realized confidence interval
#' is possible without repeated sampling of the specific experiment. What is
#' important is the assumption that a _sampling distribution_ is a good
#' description of the variability of the parameter (_Vos and Holbert 2022_).
#' At the core, the
#' interpretation of a confidence interval is "I assume that this sampling
#' distribution is a good description of the uncertainty of the parameter. If
#' that's a good assumption, then the values in this interval are the most
#' plausible or compatible with the data". The source of confidence in
#' probability statements is the assumption that the selected sampling
#' distribution is appropriate.
#'
#' "The realized confidence distribution is clearly an epistemic probability
#' distribution" (_Schweder 2018_). In Bayesian words, compatibility intervals
#' (or confidence distributions, or consonance curves) are "posteriors without
#' priors" (_Schweder and Hjort 2003_).
#'
#' The _p_-value indicates the degree of compatibility of the endpoints of the
#' interval at a given confidence level with (1) the observed data and (2) model
#' assumptions. The observed point estimate (_p_-value = 1) is the value
#' estimated to be _most compatible_ with the data and model assumptions,
#' whereas values far from the observed point estimate (where _p_
#' approaches 0) are least compatible with the data and model assumptions
#' (_Schweder and Hjort 2016, pp. 60-61; Amrhein and Greenland 2022_). In this
#' regard, _p_-values are statements about _confidence_ or _compatibility_:
#' The p-value is not an absolute measure of evidence for a model (such as the
#' null/alternative model), it is a continuous measure of the compatibility of
#' the observed data with the model used to compute it (_Greenland et al. 2016_,
#' _Greenland 2023_). Going one step further, and following _Schweder_, p-values
#' can be considered as _epistemic probability_ - "not necessarily of the
#' hypothesis being true, but of it _possibly_ being true" (_Schweder 2018_).
#' Hence, the interpretation of _p_-values might be guided using
#' [`bayestestR::p_to_pd()`].
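#'
#' For example (assuming the **bayestestR** package is installed), a two-sided
#' _p_-value can be converted into a probability of direction, where
#' `pd = 1 - p / 2`:
#'
#' ```r
#' bayestestR::p_to_pd(0.05) # 1 - 0.05 / 2 = 0.975
#' ```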
#'
#' ## Probability or compatibility?
#'
#' We have presented the discussion of p-values and confidence intervals from
#' the perspective of two paradigms: one holds that probability statements can
#' be made, the other that interpretation should be guided in terms of
#' "compatibility". Cox and Hinkley say, "interval estimates cannot be taken as
#' probability statements" (_Cox and Hinkley 1979: 208_), which conflicts with
#' the Schweder and Hjort confidence distribution school. However, if interval
#' estimates are viewed as intervals of values consistent with the data, this
#' comes close to the idea of (epistemic) probability. We do not believe that
#' these two paradigms contradict or exclude each other. Rather, each places
#' its linguistic emphasis either on "compatibility" or on "probability".
#'
#' The main take-away is *not* to interpret p-values as dichotomous decisions
#' that distinguish between "we found an effect" (statistically significant)
#' vs. "we found no effect" (statistically not significant)
#' (_Altman and Bland 1995_).
#'
#' ## Compatibility intervals - is their interpretation "conditional" or not?
#'
#' The fact that the term "conditional" is used with different meanings in
#' statistics is confusing. Thus, we would summarize the (probabilistic)
#' interpretation of compatibility intervals as follows: The intervals are built
#' from the data _and_ our modeling assumptions. The accuracy of the intervals
#' depends on our model assumptions. If a value is outside the interval, that
#' might be because (1) that parameter value isn't supported by the data, or (2)
#' the modeling assumptions are a poor fit for the situation. When we make bad
#' assumptions, the compatibility interval might be too wide or (more commonly
#' and seriously) too narrow, making us think we know more about the parameter
#' than is warranted.
#'
#' When we say "there is a 95% chance the true value is in the interval", that
#' is a statement of _epistemic probability_ (i.e. a description of uncertainty
#' related to our knowledge or belief). When we talk about repeated samples or
#' sampling distributions, that refers to _aleatory_ probability (physical
#' properties). Frequentist inference is built on defining estimators with
#' known _aleatory_ probability properties, from which we can derive
#' _epistemic_ probabilistic statements of uncertainty
#' (_Schweder and Hjort 2016_).
#'
#' ## Functions in the parameters package to check for effect existence and significance
#'
#' The **parameters** package provides several options or functions to aid
#' statistical inference. Beyond `p_function()`, there are, for example:
#' - [`equivalence_test()`][equivalence_test.lm], to compute the (conditional)
#'   equivalence test for frequentist models
#' - [`p_significance()`][p_significance.lm], to compute the probability of
#'   *practical significance*, which can be conceptualized as a unidirectional
#'   equivalence test
#' - the `pd` argument (setting `pd = TRUE`) in `model_parameters()` includes
#'   a column with the *probability of direction*, i.e. the probability that a
#'   parameter is strictly positive or negative. See [`bayestestR::p_direction()`]
#'   for details. If plotting is desired, the [`p_direction()`][p_direction.lm]
#'   function can be used, together with `plot()`.
#' - the `s_value` argument (setting `s_value = TRUE`) in `model_parameters()`
#'   replaces the p-values with their related _S_-values (*Rafi and Greenland 2020*)
#' - finally, it is possible to generate distributions of model coefficients by
#'   generating bootstrap-samples (setting `bootstrap = TRUE`) or simulating
#'   draws from model coefficients using [`simulate_model()`]. These samples
#'   can then be treated as "posterior samples" and used in many functions from
#'   the **bayestestR** package.
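#'
#' As a brief sketch of the last workflow (the model below is just an
#' illustration; **bayestestR** must be installed):
#'
#' ```r
#' model <- lm(Sepal.Length ~ Species, data = iris)
#' draws <- simulate_model(model) # one column of simulated draws per coefficient
#' # treat the simulated draws as "posterior samples"
#' bayestestR::p_direction(draws)
#' ```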
#'
#' @return A data frame with p-values and compatibility intervals.
#'
#' @references
#' - Altman DG, Bland JM. Absence of evidence is not evidence of absence. BMJ.
#'   1995;311(7003):485. \doi{10.1136/bmj.311.7003.485}
#'
#' - Amrhein V, Greenland S. Discuss practical importance of results based on
#'   interval estimates and p-value functions, not only on point estimates and
#'   null p-values. Journal of Information Technology 2022;37:316–20.
#'   \doi{10.1177/02683962221105904}
#'
#' - Cox DR, Hinkley DV. 1979. Theoretical Statistics. 6th edition.
#'   Chapman and Hall/CRC
#'
#' - Fraser DAS. The P-value function and statistical inference. The American
#'   Statistician. 2019;73(sup1):135-147. \doi{10.1080/00031305.2018.1556735}
#'
#' - Gelman A, Greenland S. Are confidence intervals better termed "uncertainty
#'   intervals"? BMJ. 2019;l5381. \doi{10.1136/bmj.l5381}
#'
#' - Greenland S, Rafi Z, Matthews R, Higgs M. To Aid Scientific Inference,
#'   Emphasize Unconditional Compatibility Descriptions of Statistics. (2022)
#'   https://arxiv.org/abs/1909.08583v7 (Accessed November 10, 2022)
#'
#' - Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, et al.
#'   (2016). Statistical tests, P values, confidence intervals, and power: A
#'   guide to misinterpretations. European Journal of Epidemiology. 31:337-350.
#'   \doi{10.1007/s10654-016-0149-3}
#'
#' - Greenland S (2023). Divergence versus decision P-values: A distinction
#'   worth making in theory and keeping in practice: Or, how divergence P-values
#'   measure evidence even when decision P-values do not. Scand J Statist, 50(1),
#'   54-88.
#'
#' - Rafi Z, Greenland S. Semantic and cognitive tools to aid statistical
#'   science: Replace confidence and significance by compatibility and surprise.
#'   BMC Medical Research Methodology. 2020;20(1):244. \doi{10.1186/s12874-020-01105-9}
#'
#' - Schweder T. Confidence is epistemic probability for empirical science.
#'   Journal of Statistical Planning and Inference (2018) 195:116–125.
#'   \doi{10.1016/j.jspi.2017.09.016}
#'
#' - Schweder T, Hjort NL. Confidence and Likelihood. Scandinavian Journal of
#'   Statistics. 2002;29(2):309-332. \doi{10.1111/1467-9469.00285}
#'
#' - Schweder T, Hjort NL. Frequentist analogues of priors and posteriors.
#'   In Stigum, B. (ed.), Econometrics and the Philosophy of Economics: Theory-Data
#'   Confrontation in Economics, pp. 285-317. Princeton University Press,
#'   Princeton, NJ, 2003
#'
#' - Schweder T, Hjort NL. Confidence, Likelihood, Probability: Statistical
#'   inference with confidence distributions. Cambridge University Press, 2016.
#'
#' - Vos P, Holbert D. Frequentist statistical inference without repeated sampling.
#'   Synthese 200, 89 (2022). \doi{10.1007/s11229-022-03560-x}
#'
#' @examplesIf requireNamespace("see")
#' model <- lm(Sepal.Length ~ Species, data = iris)
#' p_function(model)
#'
#' model <- lm(mpg ~ wt + as.factor(gear) + am, data = mtcars)
#' result <- p_function(model)
#'
#' # single panels
#' plot(result, n_columns = 2)
#'
#' # integrated plot, the default
#' plot(result)
#' @export
p_function <- function(model,
                       ci_levels = c(0.25, 0.5, 0.75, emph = 0.95),
                       exponentiate = FALSE,
                       effects = "fixed",
                       component = "all",
                       vcov = NULL,
                       vcov_args = NULL,
                       keep = NULL,
                       drop = NULL,
                       verbose = TRUE,
                       ...) {
  # degrees of freedom
  dof <- insight::get_df(model, type = "wald")

  # standard errors
  se <- standard_error(
    model,
    effects = effects,
    component = component,
    vcov = vcov,
    vcov_args = vcov_args
  )$SE

  if (is.null(dof) || length(dof) == 0 || .is_chi2_model(model, dof)) {
    dof <- Inf
  }

  x <- do.call(rbind, lapply(seq(0, 1, 0.01), function(i) {
    suppressMessages(.ci_dof(
      model,
      ci = i,
      dof,
      effects,
      component,
      method = "wald",
      se = se,
      vcov = NULL,
      vcov_args = NULL,
      verbose = TRUE
    ))
  }))

  # data for plotting
  out <- x[!is.infinite(x$CI_low) & !is.infinite(x$CI_high), ]
  out$CI <- round(out$CI, 2)

  # most plausible value (point estimate)
  point_estimate <- out$CI_low[which.min(out$CI)]

  if (!is.null(keep) || !is.null(drop)) {
    out <- .filter_parameters(out,
      keep = keep,
      drop = drop,
      verbose = verbose
    )
  }

  # transform non-Gaussian
  if (isTRUE(exponentiate)) {
    out$CI_low <- exp(out$CI_low)
    out$CI_high <- exp(out$CI_high)
  }

  # data for p_function ribbon
  data_ribbon <- datawizard::data_to_long(
    out,
    select = c("CI_low", "CI_high"),
    values_to = "x"
  )

  # data for vertical CI level lines
  out <- out[out$CI %in% ci_levels, ]
  out$group <- 1

  # emphasize focal hypothesis line
  emphasize <- which(names(ci_levels) == "emph")
  if (length(emphasize)) {
    out$group[out$CI == ci_levels[emphasize]] <- 2
  }

  attr(out, "data") <- data_ribbon
  attr(out, "point_estimate") <- point_estimate
  attr(out, "pretty_names") <- suppressWarnings(format_parameters(model, ...))

  class(out) <- c("parameters_p_function", "see_p_function", "data.frame")
  out
}

#' @rdname p_function
#' @export
consonance_function <- p_function

#' @rdname p_function
#' @export
confidence_curve <- p_function


# methods ----------------------


#' @export
plot.parameters_p_function <- function(x, ...) {
  insight::check_if_installed("see")
  NextMethod()
}


#' @export
format.parameters_p_function <- function(x,
                                         digits = 2,
                                         format = NULL,
                                         ci_width = NULL,
                                         ci_brackets = TRUE,
                                         pretty_names = TRUE,
                                         ...) {
  # print
  dat <- lapply(split(x, x$CI), function(i) {
    ci <- as.character(i$CI)[1]
    out <- datawizard::data_rename(
      i,
      select = c("CI_low", "CI_high"),
      replacement = c(sprintf("CI_low_%s", ci), sprintf("CI_high_%s", ci))
    )
    out$CI <- NULL
    out$group <- NULL
    out
  })

  out <- do.call(datawizard::data_merge, list(dat, by = "Parameter"))
  attr(out, "pretty_names") <- attributes(x)$pretty_names

  insight::format_table(
    out,
    digits = digits,
    ci_width = ci_width,
    ci_brackets = ci_brackets,
    format = format,
    pretty_names = pretty_names
  )
}


#' @export
print.parameters_p_function <- function(x,
                                        digits = 2,
                                        ci_width = "auto",
                                        ci_brackets = TRUE,
                                        pretty_names = TRUE,
                                        ...) {
  cat(.print_p_function(
    x,
    digits,
    ci_width,
    ci_brackets,
    pretty_names = pretty_names,
    format = "text",
    ...
  ))
}


#' @export
print_md.parameters_p_function <- function(x,
                                           digits = 2,
                                           ci_width = "auto",
                                           ci_brackets = c("(", ")"),
                                           pretty_names = TRUE,
                                           ...) {
  .print_p_function(x, digits, ci_width, ci_brackets, pretty_names, format = "markdown", ...)
}


#' @export
print_html.parameters_p_function <- function(x,
                                             digits = 2,
                                             ci_width = "auto",
                                             ci_brackets = c("(", ")"),
                                             pretty_names = TRUE,
                                             ...) {
  .print_p_function(x, digits, ci_width, ci_brackets, pretty_names, format = "html", ...)
}


# helper ----------

.print_p_function <- function(x,
                              digits = 2,
                              ci_width = "auto",
                              ci_brackets = c("(", ")"),
                              pretty_names = TRUE,
                              format = "html",
                              ...) {
  formatted_table <- format(
    x,
    digits = digits,
    format = format,
    ci_width = ci_width,
    ci_brackets = ci_brackets,
    pretty_names = pretty_names,
    ...
  )

  insight::export_table(
    formatted_table,
    format = format,
    caption = "Consonance Function",
    ...
  )
}

# model <- lm(Sepal.Length ~ Species, data = iris)
# for later use: highlight p-value for specific parameter estimate values
# stat <- insight::get_statistic(model)
# se <- parameters::standard_error(model)
# estimate to test against - compute p-value for specific estimate
# null_estimate <- 1.5
# p <- 2 * stats::pt(abs(stat$Statistic[3] - (null_estimate / se$SE[3])), df = 147, lower.tail = FALSE)
# bayestestR::p_to_pd(p)