File: validate_prediction_size.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/validation.R
\name{validate_prediction_size}
\alias{validate_prediction_size}
\alias{check_prediction_size}
\title{Ensure that predictions have the correct number of rows}
\usage{
validate_prediction_size(pred, new_data)

check_prediction_size(pred, new_data)
}
\arguments{
\item{pred}{A tibble. The predictions to return from any prediction
\code{type}. This is often created using one of the spruce functions, like
\code{\link[=spruce_numeric]{spruce_numeric()}}.}

\item{new_data}{A data frame of new predictors and possibly outcomes.}
}
\value{
\code{validate_prediction_size()} returns \code{pred} invisibly.

\code{check_prediction_size()} returns a named list of three components,
\code{ok}, \code{size_new_data}, and \code{size_pred}.
}
\description{
validate - asserts the following:
\itemize{
\item The size of \code{pred} must be the same as the size of \code{new_data}.
}

check - returns the following:
\itemize{
\item \code{ok} A logical. Does the check pass?
\item \code{size_new_data} A single numeric. The size of \code{new_data}.
\item \code{size_pred} A single numeric. The size of \code{pred}.
}
}
\details{
This validation function is more developer focused than user focused. It is
a final check to be used right before a value is returned from your specific
\code{predict()} method, and is mainly a "good practice" sanity check to
ensure that your prediction blueprint always returns the same number of rows
as \code{new_data}, which is one of the modeling conventions this package
tries to promote.
}
\section{Validation}{


hardhat provides validation functions at two levels.
\itemize{
\item \verb{check_*()}: \emph{check a condition, and return a list}. The list
always contains at least one element, \code{ok}, a logical that specifies if the
check passed. Each check also has check specific elements in the returned
list that can be used to construct meaningful error messages.
\item \verb{validate_*()}: \emph{check a condition, and error if it does not pass}. These
functions call their corresponding check function, and
then provide a default error message. If you, as a developer, want a
different error message, then call the \verb{check_*()} function yourself,
and provide your own validation function.
}
}

\examples{
# Say new_data has 5 rows
new_data <- mtcars[1:5, ]

# And somehow you generate predictions
# for those 5 rows
pred_vec <- 1:5

# Then you use `spruce_numeric()` to clean
# up these numeric predictions
pred <- spruce_numeric(pred_vec)

pred

# Use this check to ensure that
# the number of rows of `pred` matches `new_data`
check_prediction_size(pred, new_data)

# An informative error message is thrown
# if the rows are different
try(validate_prediction_size(spruce_numeric(1:4), new_data))
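
# Since `check_prediction_size()` returns the documented components
# (`ok`, `size_pred`, `size_new_data`), a developer can also build a
# custom error message from them instead of relying on the default
# one from `validate_prediction_size()`. A minimal sketch:
info <- check_prediction_size(spruce_numeric(1:4), new_data)
if (!info$ok) {
  message(
    "Prediction result has ", info$size_pred,
    " rows, but `new_data` has ", info$size_new_data, "."
  )
}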
}
\seealso{
Other validation functions: 
\code{\link{validate_column_names}()},
\code{\link{validate_no_formula_duplication}()},
\code{\link{validate_outcomes_are_binary}()},
\code{\link{validate_outcomes_are_factors}()},
\code{\link{validate_outcomes_are_numeric}()},
\code{\link{validate_outcomes_are_univariate}()},
\code{\link{validate_predictors_are_numeric}()}
}
\concept{validation functions}