% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/p_function.R
\name{p_function}
\alias{p_function}
\alias{consonance_function}
\alias{confidence_curve}
\title{p-value or consonance function}
\usage{
p_function(
  model,
  ci_levels = c(0.25, 0.5, 0.75, emph = 0.95),
  exponentiate = FALSE,
  effects = "fixed",
  component = "all",
  vcov = NULL,
  vcov_args = NULL,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

consonance_function(
  model,
  ci_levels = c(0.25, 0.5, 0.75, emph = 0.95),
  exponentiate = FALSE,
  effects = "fixed",
  component = "all",
  vcov = NULL,
  vcov_args = NULL,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

confidence_curve(
  model,
  ci_levels = c(0.25, 0.5, 0.75, emph = 0.95),
  exponentiate = FALSE,
  effects = "fixed",
  component = "all",
  vcov = NULL,
  vcov_args = NULL,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)
}
\arguments{
\item{model}{Statistical Model.}
\item{ci_levels}{Vector of scalars, indicating the different levels at which
compatibility intervals should be printed or plotted. In plots, these levels
are highlighted by vertical lines. It is possible to increase the thickness
of one or more of these lines by providing a named vector, where the values
to be highlighted are named \code{"emph"}, e.g.
\code{ci_levels = c(0.25, 0.5, emph = 0.95)}.}
\item{exponentiate}{Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use \code{exponentiate = TRUE} for models
with log-transformed response values. For models with a log-transformed
response variable, when \code{exponentiate = TRUE}, a one-unit increase in the
predictor is associated with multiplying the outcome by that predictor's
coefficient. \strong{Note:} Delta-method standard errors are also computed (by
multiplying the standard errors by the transformed coefficients). This is
to mimic behaviour of other software packages, such as Stata, but these
standard errors poorly estimate uncertainty for the transformed
coefficient. The transformed confidence interval more clearly captures this
uncertainty. For \code{compare_parameters()}, \code{exponentiate = "nongaussian"}
will only exponentiate coefficients from non-Gaussian families.}
\item{effects}{Should parameters for fixed effects (\code{"fixed"}), random
effects (\code{"random"}), both fixed and random effects (\code{"all"}), or the
overall (sum of fixed and random) effects (\code{"random_total"}) be returned?
Only applies to mixed models. May be abbreviated. If the calculation of
random effects parameters takes too long, you may use \code{effects = "fixed"}.}
\item{component}{Which type of parameters to return: parameters for the
conditional model, the zero-inflation part of the model, the dispersion
term, or other auxiliary parameters? Applies to models with
zero-inflation and/or dispersion formula, or if parameters such as \code{sigma}
should be included. May be abbreviated. Note that the \emph{conditional}
component is also called the \emph{count} or \emph{mean} component, depending on the
model. There are three convenient shortcuts: \code{component = "all"} returns
all possible parameters. If \code{component = "location"}, location parameters
such as \code{conditional}, \code{zero_inflated}, or \code{smooth_terms} are returned
(everything that is a fixed or random effect - depending on the \code{effects}
argument - but no auxiliary parameters). For \code{component = "distributional"}
(or \code{"auxiliary"}), components like \code{sigma}, \code{dispersion}, or \code{beta}
(and other auxiliary parameters) are returned.}
\item{vcov}{Variance-covariance matrix used to compute uncertainty estimates
(e.g., for robust standard errors). This argument accepts a covariance
matrix, a function which returns a covariance matrix, or a string which
identifies the function to be used to compute the covariance matrix.
\itemize{
\item A covariance matrix
\item A function which returns a covariance matrix (e.g., \code{stats::vcov()})
\item A string which indicates the kind of uncertainty estimates to return.
\itemize{
\item Heteroskedasticity-consistent: \code{"HC"}, \code{"HC0"}, \code{"HC1"}, \code{"HC2"},
\code{"HC3"}, \code{"HC4"}, \code{"HC4m"}, \code{"HC5"}. See \code{?sandwich::vcovHC}
\item Cluster-robust: \code{"CR"}, \code{"CR0"}, \code{"CR1"}, \code{"CR1p"}, \code{"CR1S"},
\code{"CR2"}, \code{"CR3"}. See \code{?clubSandwich::vcovCR}
\item Bootstrap: \code{"BS"}, \code{"xy"}, \code{"residual"}, \code{"wild"}, \code{"mammen"},
\code{"fractional"}, \code{"jackknife"}, \code{"norm"}, \code{"webb"}. See
\code{?sandwich::vcovBS}
\item Other \code{sandwich} package functions: \code{"HAC"}, \code{"PC"}, \code{"CL"}, \code{"OPG"},
\code{"PL"}.
}
}
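For example, \code{vcov = "HC3"} computes heteroskedasticity-consistent
standard errors (this assumes the \strong{sandwich} package is installed);
an equivalent, more explicit form is \code{vcov = sandwich::vcovHC} with
\code{vcov_args = list(type = "HC3")}.}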
\item{vcov_args}{List of arguments to be passed to the function identified by
the \code{vcov} argument. This function is typically supplied by the
\strong{sandwich} or \strong{clubSandwich} packages. Please refer to their
documentation (e.g., \code{?sandwich::vcovHAC}) to see the list of available
arguments. If no estimation type (argument \code{type}) is given, the default
type for \code{"HC"} equals the default from the \strong{sandwich} package; for type
\code{"CR"}, the default is set to \code{"CR3"}.}
\item{keep}{Character containing a regular expression pattern that
describes the parameters that should be included (for \code{keep}) or excluded
(for \code{drop}) in the returned data frame. \code{keep} may also be a
named list of regular expressions. All non-matching parameters will be
removed from the output. If \code{keep} is a character vector, every parameter
name in the \emph{"Parameter"} column that matches the regular expression in
\code{keep} will be selected from the returned data frame (and vice versa,
all parameter names matching \code{drop} will be excluded). Furthermore, if
\code{keep} has more than one element, these will be merged with an \code{OR}
operator into a regular expression pattern like this: \code{"(one|two|three)"}.
If \code{keep} is a named list of regular expression patterns, the names of the
list-element should equal the column name where selection should be
applied. This is useful for model objects where \code{model_parameters()}
returns multiple columns with parameter components, like in
\code{\link[=model_parameters.lavaan]{model_parameters.lavaan()}}. Note that the regular expression pattern
should match the parameter names as they are stored in the returned data
frame, which can be different from how they are printed. Inspect the
\verb{$Parameter} column of the parameters table to get the exact parameter
names.}
\item{drop}{See \code{keep}.}
\item{verbose}{Toggle warnings and messages.}
\item{...}{Arguments passed to or from other methods. Non-documented
arguments are
\itemize{
\item \code{digits}, \code{p_digits}, \code{ci_digits} and \code{footer_digits} to set the number of
digits for the output. \code{groups} can be used to group coefficients. These
arguments will be passed to the print-method, or can directly be used in
\code{print()}, see documentation in \code{\link[=print.parameters_model]{print.parameters_model()}}.
\item If \code{s_value = TRUE}, the p-value will be replaced by the S-value in the
output (cf. \emph{Rafi and Greenland 2020}).
\item \code{pd} adds an additional column with the \emph{probability of direction} (see
\code{\link[bayestestR:p_direction]{bayestestR::p_direction()}} for details). Furthermore, see 'Examples' in
\code{\link[=model_parameters.default]{model_parameters.default()}}.
\item For developers whose main interest is a "tidy" data frame of model
summaries, it is recommended to set \code{pretty_names = FALSE} to speed
up computation of the summary table.
}}
}
\value{
A data frame with p-values and compatibility intervals.
}
\description{
Compute p-values and compatibility (confidence) intervals for
statistical models, at different levels. This function is also called a
consonance function. It allows you to see which estimates are compatible with
the model at various compatibility levels. Use \code{plot()} to generate plots
of the \emph{p}-function (or \emph{consonance} function) and compatibility
intervals at different levels.
}
\details{
\subsection{Compatibility intervals and continuous \emph{p}-values for different estimate values}{
\code{p_function()} only returns the compatibility interval estimates, not the
related \emph{p}-values. The reason is that the \emph{p}-value for a
given estimate value is simply \code{1 - CI_level}. The values indicating the lower
and upper limits of the intervals are the estimates associated with
that \emph{p}-value. E.g., if a parameter \code{x} has a 75\% compatibility interval
of \verb{(0.81, 1.05)}, then the \emph{p}-value for the estimate value of \code{0.81}
would be \code{1 - 0.75}, which is \code{0.25}. This relationship is more intuitive and
easier to understand when looking at the plots (using \code{plot()}).
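
A minimal sketch of this relationship, using a simple example model:

\preformatted{model <- lm(Sepal.Length ~ Species, data = iris)
out <- p_function(model, ci_levels = c(0.5, 0.75, emph = 0.95))
out
# each CI_low/CI_high pair contains the estimate values whose p-value
# equals 1 - CI, e.g. p = 0.25 at the limits of the 75\% interval
}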
}
\subsection{Conditional versus unconditional interpretation of \emph{p}-values and intervals}{
\code{p_function()}, and in particular its \code{plot()} method, aims to re-interpret
\emph{p}-values and confidence intervals (better named: \emph{compatibility} intervals)
in \emph{unconditional} terms. Instead of referring to the long-term property and
repeated trials when interpreting interval estimates (so-called "aleatory
probability", \emph{Schweder 2018}), and assuming that all underlying assumptions
are correct and met, \code{p_function()} interprets \emph{p}-values in a Fisherian way
as "\emph{continuous} measure of evidence against the very test hypothesis \emph{and}
entire model (all assumptions) used to compute it"
(\emph{P-Values Are Tough and S-Values Can Help}, lesslikely.com/statistics/s-values;
see also \emph{Amrhein and Greenland 2022}).
The common definition of p-values can be considered a "conditional"
interpretation:
\emph{The p-value is the probability of obtaining test results at least as
extreme as the result actually observed, under the assumption that the
null hypothesis is correct (Wikipedia).}
However, this definition or interpretation is inadequate because it only
refers to the test hypothesis (often the null hypothesis), which is only
one component of the entire model that is being tested. Thus,
\emph{Greenland et al. 2022} suggest an "unconditional" interpretation.
This interpretation as a continuous measure of evidence against the test
hypothesis and the entire model used to compute it can be seen in the
figure below (taken from \emph{P-Values Are Tough and S-Values Can Help},
lesslikely.com/statistics/s-values). The "conditional" interpretation of
\emph{p}-values and interval estimates (A) implicitly assumes certain assumptions
to be true, thus the interpretation is "conditioned" on these assumptions
(i.e. assumptions are taken as given, only the hypothesis is tested). The
unconditional interpretation (B), however, questions \emph{all} these assumptions.
A non-significant p-value could occur because the test hypothesis is false,
but could also be the result of any of the model assumptions being incorrect.
\if{html}{\cr \figure{unconditional_interpretation.png}{options: alt="Conditional versus unconditional interpretations of P-values"} \cr}
"Emphasizing unconditional interpretations helps avoid overconfident and
misleading inferences in light of uncertainties about the assumptions used
to arrive at the statistical results." (\emph{Greenland et al. 2022}).
\strong{Note:} The term "conditional" as used by Rafi and Greenland probably has
a slightly different meaning than usual. "Conditional" in this context
means that all model assumptions are taken as given - it should not be
confused with terms like "conditional probability". See also \emph{Greenland et al. 2022}
for a detailed elaboration on this issue.
In other words, the term compatibility interval emphasizes "the dependence
of the \emph{p}-value on the assumptions as well as on the data, recognizing that
\emph{p}<0.05 can arise from assumption violations even if the effect under
study is null" (\emph{Gelman/Greenland 2019}).
}
\subsection{Probabilistic interpretation of p-values and compatibility intervals}{
Schweder (2018) and Schweder and Hjort (2016) (among others) argue that
confidence curves (as produced by \code{p_function()}) have a valid probabilistic
interpretation. They distinguish between \emph{aleatory} probability, which
describes the stochastic element of a distribution \emph{ex ante}, i.e.
before the data are obtained - this is the classical interpretation of
confidence intervals following the Neyman-Pearson school of statistics - and
an \emph{ex post}, \emph{epistemic} probability for confidence curves. The
shift in terminology from \emph{confidence} intervals to \emph{compatibility}
intervals may help emphasize this interpretation.
In this sense, the probabilistic interpretation of \emph{p}-values and
compatibility intervals is "conditional" - on the data \emph{and} model assumptions
(which is in line with the \emph{"unconditional"} interpretation in the sense of
Rafi and Greenland).
Ascribing a probabilistic interpretation to one realized confidence interval
is possible without repeated sampling of the specific experiment. What is
important is the assumption that a \emph{sampling distribution} is a good
description of the variability of the parameter (\emph{Vos and Holbert 2022}).
At the core, the
interpretation of a confidence interval is "I assume that this sampling
distribution is a good description of the uncertainty of the parameter. If
that's a good assumption, then the values in this interval are the most
plausible or compatible with the data". The source of confidence in
probability statements is the assumption that the selected sampling
distribution is appropriate.
"The realized confidence distribution is clearly an epistemic probability
distribution" (\emph{Schweder 2018}). In Bayesian words, compatibility intervals
(or confidence distributons, or consonance curves) are "posteriors without
priors" (\emph{Schweder, Hjort, 2003}).
The \emph{p}-value indicates the degree of compatibility of the endpoints of the
interval at a given confidence level with (1) the observed data and (2) model
assumptions. The observed point estimate (\emph{p}-value = 1) is the value
estimated to be \emph{most compatible} with the data and model assumptions,
whereas values far from the observed point estimate (where \emph{p}
approaches 0) are least compatible with the data and model assumptions
(\emph{Schweder and Hjort 2016, pp. 60-61; Amrhein and Greenland 2022}). In this
regard, \emph{p}-values are statements about \emph{confidence} or \emph{compatibility}:
The p-value is not an absolute measure of evidence for a model (such as the
null/alternative model), it is a continuous measure of the compatibility of
the observed data with the model used to compute it (\emph{Greenland et al. 2016},
\emph{Greenland 2023}). Going one step further, and following \emph{Schweder}, p-values
can be considered as \emph{epistemic probability} - "not necessarily of the
hypothesis being true, but of it \emph{possibly} being true" (\emph{Schweder 2018}).
Hence, the interpretation of \emph{p}-values might be guided using
\code{\link[bayestestR:pd_to_p]{bayestestR::p_to_pd()}}.
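
For instance (a minimal sketch; \code{p_to_pd()} is provided by the
\strong{bayestestR} package):

\preformatted{# a two-sided p-value of 0.05 corresponds to a probability
# of direction of about 0.975
bayestestR::p_to_pd(0.05)
}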
}
\subsection{Probability or compatibility?}{
Here we have presented the discussion of p-values and confidence intervals
from the perspective of two paradigms: one saying that probability statements
can be made, the other saying that interpretation is guided in terms of
"compatibility".
Cox and Hinkley say, "interval estimates cannot be taken as probability
statements" (\emph{Cox and Hinkley 1979: 208}), which conflicts with the Schweder
and Hjort confidence distribution school. However, if you view interval
estimates as being intervals of values being consistent with the data,
this comes close to the idea of (epistemic) probability. We do not believe that
these two paradigms contradict or exclude each other. Rather, the aim is to
emphasize one point of view or the other, i.e. to place the linguistic
nuance either on 'compatibility' or on 'probability'.
The main take-away is \emph{not} to interpret p-values as dichotomous decisions
that distinguish between "we found an effect" (statistically significant) vs.
"we found no effect" (statistically not significant) (\emph{Altman and Bland 1995}).
}
\subsection{Compatibility intervals - is their interpretation "conditional" or not?}{
The fact that the term "conditional" is used with different meanings in
statistics is confusing. Thus, we would summarize the (probabilistic)
interpretation of compatibility intervals as follows: The intervals are built
from the data \emph{and} our modeling assumptions. The accuracy of the intervals
depends on our model assumptions. If a value is outside the interval, that
might be because (1) that parameter value isn't supported by the data, or (2)
the modeling assumptions are a poor fit for the situation. When we make bad
assumptions, the compatibility interval might be too wide or (more commonly
and seriously) too narrow, making us think we know more about the parameter
than is warranted.
When we say "there is a 95\% chance the true value is in the interval", that is
a statement of \emph{epistemic probability} (i.e. description of uncertainty related
to our knowledge or belief). When we talk about repeated samples or sampling
distributions, we are referring to \emph{aleatory} probability (physical
properties). Frequentist inference is built on defining estimators with known
\emph{aleatory} probability properties, from which we can draw \emph{epistemic}
probabilistic statements of uncertainty (\emph{Schweder and Hjort 2016}).
}
\subsection{Functions in the parameters package to check for effect existence and significance}{
The \strong{parameters} package provides several options or functions to aid
statistical inference. Beyond \code{p_function()}, there are, for example:
\itemize{
\item \code{\link[=equivalence_test.lm]{equivalence_test()}}, to compute the (conditional)
equivalence test for frequentist models
\item \code{\link[=p_significance.lm]{p_significance()}}, to compute the probability of
\emph{practical significance}, which can be conceptualized as a unidirectional
equivalence test
\item the \code{pd} argument (setting \code{pd = TRUE}) in \code{model_parameters()} includes
a column with the \emph{probability of direction}, i.e. the probability that a
parameter is strictly positive or negative. See \code{\link[bayestestR:p_direction]{bayestestR::p_direction()}}
for details. If plotting is desired, the \code{\link[=p_direction.lm]{p_direction()}}
function can be used, together with \code{plot()}.
\item the \code{s_value} argument (setting \code{s_value = TRUE}) in \code{model_parameters()}
replaces the p-values with their related \emph{S}-values (\emph{Rafi and Greenland 2020})
\item finally, it is possible to generate distributions of model coefficients by
generating bootstrap-samples (setting \code{bootstrap = TRUE}) or simulating
draws from model coefficients using \code{\link[=simulate_model]{simulate_model()}}. These samples
can then be treated as "posterior samples" and used in many functions from
the \strong{bayestestR} package.
}
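
A brief sketch of some of these options (using arguments and functions as
documented above):

\preformatted{model <- lm(mpg ~ wt + am, data = mtcars)

# replace p-values by S-values and add the probability of direction
model_parameters(model, s_value = TRUE, pd = TRUE)

# probability of practical significance
p_significance(model)
}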
}
}
\note{
Currently, \code{p_function()} computes intervals based on Wald t- or z-statistics.
For certain models (like mixed models), profiled intervals may be more
accurate; however, these are not yet supported.
}
\examples{
\dontshow{if (requireNamespace("see")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
# example: one categorical predictor
model <- lm(Sepal.Length ~ Species, data = iris)
p_function(model)

# example: multiple predictors
model <- lm(mpg ~ wt + as.factor(gear) + am, data = mtcars)
result <- p_function(model)
# single panels
plot(result, n_columns = 2)
# integrated plot, the default
plot(result)
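
# emphasize the 99\% interval level
result <- p_function(model, ci_levels = c(0.5, 0.75, 0.95, emph = 0.99))
plot(result)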
\dontshow{\}) # examplesIf}
}
\references{
\itemize{
\item Altman DG, Bland JM. Absence of evidence is not evidence of absence. BMJ.
1995;311(7003):485. \doi{10.1136/bmj.311.7003.485}
\item Amrhein V, Greenland S. Discuss practical importance of results based on
interval estimates and p-value functions, not only on point estimates and
null p-values. Journal of Information Technology 2022;37:316–20.
\doi{10.1177/02683962221105904}
\item Cox DR, Hinkley DV. 1979. Theoretical Statistics. 6th edition.
Chapman and Hall/CRC
\item Fraser DAS. The P-value function and statistical inference. The American
Statistician. 2019;73(sup1):135-147. \doi{10.1080/00031305.2018.1556735}
\item Gelman A, Greenland S. Are confidence intervals better termed "uncertainty
intervals"? BMJ (2019) 366:l5381. \doi{10.1136/bmj.l5381}
\item Greenland S, Rafi Z, Matthews R, Higgs M. To Aid Scientific Inference,
Emphasize Unconditional Compatibility Descriptions of Statistics. (2022)
https://arxiv.org/abs/1909.08583v7 (Accessed November 10, 2022)
\item Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, et al.
(2016). Statistical tests, P values, confidence intervals, and power: A
guide to misinterpretations. European Journal of Epidemiology. 31:337-350.
\doi{10.1007/s10654-016-0149-3}
\item Greenland S (2023). Divergence versus decision P-values: A distinction
worth making in theory and keeping in practice: Or, how divergence P-values
measure evidence even when decision P-values do not. Scand J Statist, 50(1),
54-88.
\item Rafi Z, Greenland S. Semantic and cognitive tools to aid statistical
science: Replace confidence and significance by compatibility and surprise.
BMC Medical Research Methodology. 2020;20(1):244. \doi{10.1186/s12874-020-01105-9}
\item Schweder T. Confidence is epistemic probability for empirical science.
Journal of Statistical Planning and Inference (2018) 195:116–125.
\doi{10.1016/j.jspi.2017.09.016}
\item Schweder T, Hjort NL. Confidence and Likelihood. Scandinavian Journal of
Statistics. 2002;29(2):309-332. \doi{10.1111/1467-9469.00285}
\item Schweder T, Hjort NL. Frequentist analogues of priors and posteriors.
In Stigum, B. (ed.), Econometrics and the Philosophy of Economics: Theory-Data
Confrontation in Economics, pp. 285-317. Princeton University Press,
Princeton, NJ, 2003
\item Schweder T, Hjort NL. Confidence, Likelihood, Probability: Statistical
inference with confidence distributions. Cambridge University Press, 2016.
\item Vos P, Holbert D. Frequentist statistical inference without repeated sampling.
Synthese 200, 89 (2022). \doi{10.1007/s11229-022-03560-x}
}
}
\seealso{
See also \code{\link[=equivalence_test]{equivalence_test()}} and \code{\link[=p_significance]{p_significance()}} for
functions related to checking effect existence and significance.
}