\name{plot.info}
\alias{plot.info}
\title{Plot the proportion of missing genotype information}
\description{
Plot a measure of the proportion of missing information in the
genotype data.
}
\usage{
plot.info(x, chr, method=c("both","entropy","variance"), step=1,
off.end=0, error.prob=0.001,
map.function=c("haldane","kosambi","c-f","morgan"),
alternate.chrid=FALSE, \dots)
}
\arguments{
\item{x}{An object of class \code{cross}. See
\code{\link[qtl]{read.cross}} for details.}
\item{chr}{Vector specifying the chromosomes to plot.}
\item{method}{Indicates whether to plot the entropy version of the
information, the variance version, or both.}
\item{step}{Maximum distance (in cM) between positions at which the
missing information is calculated; if \code{step=0}, it is
calculated only at the marker locations.}
\item{off.end}{Distance (in cM) past the terminal markers on each
chromosome to which the genotype probability calculations will be
carried.}
\item{error.prob}{Assumed genotyping error rate used in the calculation
of the penetrance Pr(observed genotype | true genotype).}
\item{map.function}{Indicates whether to use the Haldane, Kosambi,
Carter-Falconer, or Morgan map function when converting genetic
distances into recombination fractions.}
\item{alternate.chrid}{If TRUE and more than one chromosome is
plotted, alternate the placement of chromosome
axis labels, so that they may be more easily distinguished.}
\item{\dots}{Passed to \code{\link[qtl]{plot.scanone}}.}
}
\details{
The entropy version of the missing information: for a single
individual at a single genomic position, we measure the missing
information as \eqn{H = -\sum_g p_g \log p_g / \log n}{H = -sum p[g] log
p[g] / log n}, where \eqn{p_g}{p[g]} is the probability of the
genotype \eqn{g}, and \eqn{n} is the number of possible genotypes,
defining \eqn{0 \log 0 = 0}{0 log 0 = 0}. This takes values between 0
and 1, assuming the value 1 when the genotypes (given the marker data)
are equally likely and 0 when the genotypes are completely determined.
We calculate the missing information at a particular position as the
average of \eqn{H} across individuals. For an intercross, we don't
scale by \eqn{\log n} but by the entropy in the case of genotype
probabilities (1/4, 1/2, 1/4).

The variance version of the missing information: we calculate the
average, across individuals, of the variance of the genotype
distribution (conditional on the observed marker data) at a particular
locus, and scale by the maximum such variance.

Calculations are done in C (for the sake of speed in the presence of
little thought about programming efficiency) and the plot is created
by a call to \code{\link[qtl]{plot.scanone}}.

Note that \code{\link[qtl]{summary.scanone}} may be used to display
the maximum missing information on each chromosome.
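
As a worked instance of the entropy measure (the numbers are
illustrative and not taken from any data set): a backcross individual
(\eqn{n = 2}) with genotype probabilities (0.9, 0.1) at some position
has
\deqn{H = -(0.9 \log 0.9 + 0.1 \log 0.1) / \log 2 \approx 0.47,}{H = -(0.9 log 0.9 + 0.1 log 0.1) / log 2 = 0.47 (approximately),}
so by this measure a bit under half of the genotype information is
missing there. For an intercross, the scaling constant is the entropy
of (1/4, 1/2, 1/4), which equals \eqn{1.5 \log 2}{1.5 log 2}.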
}
\value{
An object with class \code{scanone}: a data frame whose first columns
are the chromosome IDs and cM positions, followed by the entropy
and/or variance version of the missing information.
}
\examples{
data(hyper)
\dontshow{hyper <- subset(hyper,chr=1:4)}
plot.info(hyper,chr=c(1,4))
# save the results and view maximum missing info on each chr
info <- plot.info(hyper)
summary(info)
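
# A minimal sketch, in plain R, of the two measures described in Details,
# evaluated at a single position. This is NOT the package's C code.
# Assumptions: 'entropy.info' and 'variance.info' are hypothetical helper
# names; 'prob' is an individuals x genotypes matrix of genotype
# probabilities; genotypes are coded 0, 1, ..., n-1 for the variance; and
# 'maxvar' (the maximum variance) is supplied by the caller.
entropy.info <- function(prob, scale=log(ncol(prob))) {
  plogp <- ifelse(prob > 0, prob*log(prob), 0)  # define 0 log 0 = 0
  mean(-rowSums(plogp)/scale)                   # average H across individuals
}
variance.info <- function(prob, maxvar) {
  g <- seq_len(ncol(prob)) - 1                  # assumed genotype codes
  m1 <- colSums(t(prob)*g)                      # mean genotype, per individual
  m2 <- colSums(t(prob)*g^2)                    # mean squared genotype
  mean(m2 - m1^2)/maxvar                        # scaled average variance
}

# made-up backcross probabilities for three individuals
prob <- rbind(c(0.5, 0.5),   # genotypes equally likely
              c(1.0, 0.0),   # genotype completely determined
              c(0.9, 0.1))   # mostly determined
entropy.info(prob)                 # about 0.49
variance.info(prob, maxvar=0.25)   # about 0.45; 0.25 is the variance at (1/2, 1/2)

# intercross: scale by the entropy of (1/4, 1/2, 1/4), i.e. 1.5*log(2)
f2 <- rbind(c(0.25, 0.5, 0.25))
entropy.info(f2, scale=1.5*log(2)) # 1, up to rounding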
}
\seealso{ \code{\link[qtl]{plot.scanone}},
\code{\link[qtl]{plot.missing}} }
\author{Karl W Broman, \email{kbroman@biostat.wisc.edu} }
\keyword{hplot}
\keyword{univar}