File: peek.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/peek.R
\name{peek}
\alias{peek}
\alias{histOMatic}
\title{Show variables, one at a time, QUICKLY and EASILY.}
\usage{
peek(
  dat,
  sort = TRUE,
  file = NULL,
  textout = FALSE,
  ask,
  ...,
  xlabstub = "kutils peek: ",
  freq = FALSE,
  histargs = list(probability = !freq),
  barargs = list(horiz = TRUE, las = 1)
)
}
\arguments{
\item{dat}{An R data frame or something that can be coerced to a
data frame by \code{as.data.frame}}

\item{sort}{Default TRUE. Should the columns be displayed in
alphabetical order?}

\item{file}{Should output go to a file rather than to the screen?
Default is NULL, meaning show on screen. If you supply a file
name, PDF output will be written to it.}

\item{textout}{If TRUE, counts from histogram bins and tables will
appear in the console.}

\item{ask}{As in the old style R \code{par(ask = TRUE)}: should
keyboard interaction be required to advance to the next plot?  Will
default to FALSE if the file argument is non-null.  If file is NULL,
setting ask = FALSE will cause graphs to whir by without
pausing.}

\item{...}{Additional arguments for the pdf, histogram, table, or
barplot functions. Please see Details below.}

\item{xlabstub}{A text stub that will appear in the x axis
label. Currently it includes advertising for this package.}

\item{freq}{As in the histogram frequency argument. Should graphs
show counts (freq = TRUE) or proportions (AKA densities) (freq
= FALSE)?}

\item{histargs}{A list of arguments to be passed to the
\code{hist} function.}

\item{barargs}{A list of arguments to be passed to the
\code{barplot} function.}
}
\value{
A vector of column names that were plotted
}
\description{
This makes it easy to quickly scan through all of the columns in a
data frame to spot unexpected patterns or data entry errors.
Numeric variables are depicted as
histograms, while factor and character variables are summarized by
the R table function and then presented as barplots. This is most
useful with a large graphics device: try the function
provided with this package, \code{dev.create(height=7, width=7)},
or any other method you prefer for creating a large device.
}
\section{Try the Defaults}{
 Every effort has been made to make this
    simple and easy to use. Please run the examples as they are
    before becoming too concerned about customization.  This
    function is intended for getting a quick look at each
    variable, one by one; it is not intended to create publication
    quality histograms.  For the sake of fastidious users, a lot
    of settings can be adjusted. Users can control the parameters
    for presentation of histograms (parameters for \code{hist})
    and barplots (parameters for \code{barplot}). The function can
    also create frequency tables (which users can control by providing
    additional named arguments).
}

\section{Style}{
 The histograms are standard, upright histograms.
    The barplots are horizontal. I chose to make the bars
    horizontal because long value labels are more easily
    accommodated on the left axis.  The code measures the length
    (in inches) of the label strings and increases the margin
    accordingly.  The examples include a demonstration of that
    effect.
}

\section{Dealing with Dots}{
 Additional named arguments,
    \code{...}, are inspected and sorted into groups intended to
    control use of R functions \code{hist}, \code{barplot},
    \code{table} and \code{pdf}.  \cr \cr The parameters
    c("exclude", "dnn", "useNA", "deparse.level") will go to
    the \code{table} function, which is used to make barplots for
    factor and character variables. These named arguments are
    extracted and sent to the pdf function: c("width", "height",
    "onefile", "family", "title", "fonts", "version", "paper",
    "encoding", "bg", "fg", "pointsize", "pagecentre",
    "colormodel", "useDingbats", "useKerning", "fillOddEven",
    "compress"). Any other arguments that are unique to
    \code{hist} or \code{barplot} are sorted out and sent only to
    those functions.  \cr \cr Any other arguments, including
    graphical parameters will be sent to both the histogram and
    barplot functions, so it is a convenient way to obtain uniform
    appearance. Additional arguments that are common to
    \code{barplot} and \code{hist} will work, and so will any
    graphics parameters (named arguments of \code{par}, for
    example). However, if one wants to target some arguments to
    \code{hist}, but not \code{barplot}, then the \code{histargs}
    list argument should be used. Similarly, \code{barargs} should
    be used to send arguments to the \code{barplot}
    function. Warning: the defaults for \code{histargs} and
    \code{barargs} include some settings that are needed for the
    existing design.  If new lists for \code{histargs} or
    \code{barargs} are supplied, the previously specified defaults
    are lost.  Hence, users should include the existing members of
    those lists, possibly with revised values.  \cr \cr All of
    this argument sorting effort is done in order to reduce the
    large number of warnings that were observed in previous
    editions of this function.
}

\examples{
\donttest{
set.seed(234234)
N <- 200
mydf <- data.frame(x5 = rnorm(N), x4 = rnorm(N), x3 = rnorm(N),
                   x2 = letters[sample(1:24, 200, replace = TRUE)],
                   x1 = factor(sample(c("cindy", "bobby", "marsha",
                                        "greg", "chris"), 200, replace = TRUE)),
                   stringsAsFactors = FALSE)
## Insert 16 missing values
mydf$x1[sample(1:150, 16)] <- NA
## Four-digit years need \%Y (\%y would read only two digits)
mydf$adate <- as.Date(c("1jan1960", "2jan1960", "31mar1960", "30jul1960"),
                      format = "\%d\%b\%Y")
peek(mydf)
peek(mydf, sort = FALSE)
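## A hedged illustration of the documented return value: peek() is
## described as returning the names of the columns that were plotted.
## ask = FALSE lets the plots run past without pausing.
plotted <- peek(mydf, ask = FALSE)
print(plotted)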
## Demonstrate the dot-dot-dot usage to pass in hist params
peek(mydf, breaks = 30, ylab = "These are Counts, not Densities", freq = TRUE)
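## The "Dealing with Dots" section says table() arguments such as useNA
## are also picked out of the dots. A sketch, assuming that sorting works
## as documented: show the NA count in the barplots for x1 and x2.
peek(mydf, useNA = "always")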
## Not Run: file output
## peek(mydf, sort = FALSE, file = "three_histograms.pdf")
## Use some objects from the datasets package
library(datasets)
peek(cars, xlabstub = "R cars data: ")
peek(EuStockMarkets, xlabstub = "Euro Market Data: ")
peek(EuStockMarkets, xlabstub = "Euro Market Data: ", breaks = 50,
     freq = TRUE)
## Not run: file output
## peek(EuStockMarkets, breaks = 50, file = "myeuro.pdf",
##      height = 4, width = 3, family = "Times")
## One file per plot: \%03d is replaced by a zero-padded page number
## peek(EuStockMarkets, breaks = 50, file = "myeuro-\%03d.pdf",
##      onefile = FALSE, family = "Times", textout = TRUE)
## ylab goes into "..." and affects both histograms and barplots
peek(mydf, breaks = 30, ylab = "These are Counts, not Densities",
    freq = TRUE)
## xlab is added in the barargs list.
peek(mydf, breaks = 30, ylab = "These are Counts, not Densities",
    freq = TRUE, barargs = list(horiz = TRUE, las = 1, xlab = "I'm in barargs"))
peek(mydf, breaks = 30, ylab = "These are Counts, not Densities", freq = TRUE,
     barargs = list(horiz = TRUE, las = 1, xlim = c(0, 100),
     xlab = "I'm in barargs, not in histargs"))
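## Replacing histargs drops its default, list(probability = !freq), so the
## sketch below restates that member next to a new hist-only setting.
## The name myhistargs is illustrative; border is an argument of hist().
myhistargs <- list(probability = TRUE, border = "white")
peek(mydf, breaks = 30, histargs = myhistargs)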
levels(mydf$x1) <- c(levels(mydf$x1), "arthur philpot smythe")
mydf$x1[4] <- "arthur philpot smythe"
mydf$x2[1] <- "I forgot what letter"
peek(mydf, breaks = 30,
     barargs = list(horiz = TRUE, las = 1))
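## A minimal sketch, not the kutils source, of the kind of argument sorting
## described in "Dealing with Dots". The helper name split_dots and its
## return structure are hypothetical; only the table argument names are
## taken from the documentation above.
split_dots <- function(...) {
    dots <- list(...)
    tablenames <- c("exclude", "dnn", "useNA", "deparse.level")
    ## TRUE for each supplied argument whose name belongs to table()
    istable <- is.element(names(dots), tablenames)
    list(table = dots[istable], other = dots[!istable])
}
sorted <- split_dots(useNA = "always", breaks = 30, col = "gray80")
names(sorted$table)
names(sorted$other)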
}
}
\author{
Paul Johnson <pauljohn@ku.edu>
}