% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/data_seek.R
\name{data_seek}
\alias{data_seek}
\title{Find variables by their names, variable or value labels}
\usage{
data_seek(data, pattern, seek = c("names", "labels"), fuzzy = FALSE)
}
\arguments{
\item{data}{A data frame.}
\item{pattern}{Character string (regular expression) to be matched in \code{data}.
May also be a character vector of length > 1. \code{pattern} is searched for in
column names, variable label and value label attributes, or factor levels of
variables in \code{data}.}
\item{seek}{Character vector, indicating where \code{pattern} is sought. Use one
or more of the following options:
\itemize{
\item \code{"names"}: Searches in column names. \code{"column_names"} and \code{"columns"} are
aliases for \code{"names"}.
\item \code{"labels"}: Searches in variable labels. Only applies when a \code{label} attribute
is set for a variable.
\item \code{"values"}: Searches in value labels or factor levels. Only applies when a
\code{labels} attribute is set for a variable, or if a variable is a factor.
\code{"levels"} is an alias for \code{"values"}.
\item \code{"all"}: Searches in all of the above.
}}
\item{fuzzy}{Logical. If \code{TRUE}, "fuzzy matching" (partial and close distance
matching) will be used to find \code{pattern}.}
}
\value{
A data frame with three columns: the column index, the column name
and - if available - the variable label of all matched variables in \code{data}.
}
\description{
This function seeks variables in a data frame, based on patterns
that match the variable name (column name), variable labels, value labels,
or factor levels. Matching variable and value labels only works for "labelled"
data, i.e. when the variables have either a \code{label} attribute or a \code{labels}
attribute.
\code{data_seek()} is particularly useful for larger data frames with labelled
data, where finding the correct variable name can be a challenge. This function
helps to find the required variables when only certain patterns of variable
names or labels are known.
}
\examples{
# seek variables with "Length" in variable name or labels
data_seek(iris, "Length")
# seek variables with "dependency" in names or labels
# column "e42dep" has a label-attribute "elder's dependency"
data(efc)
data_seek(efc, "dependency")
# "female" only appears as value label attribute - default search is in
# variable names and labels only, so no match
data_seek(efc, "female")
# when we seek in all sources, we find the variable "e16sex"
data_seek(efc, "female", seek = "all")
# typo, no match
data_seek(iris, "Lenght")
# typo, fuzzy match
data_seek(iris, "Lenght", fuzzy = TRUE)
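# A minimal sketch based on the argument descriptions, not on verified
# output: "pattern" may be a character vector of length > 1
data_seek(iris, c("Length", "Width"))
# seek = "values" also searches factor levels; "setosa" is a level of
# the factor iris$Species
data_seek(iris, "setosa", seek = "values")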
}