% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/trim.R
\name{trim}
\alias{trim}
\alias{trimFile}
\alias{trimPrint}
\alias{trimPrint,ANY,character-method}
\alias{trimStr}
\alias{trimStr,ANY,character-method}
\alias{trimChr}
\alias{trimChr,ANY,character-method}
\alias{trimDeparse}
\alias{trimDeparse,ANY,character-method}
\alias{trimFile,ANY,character-method}
\title{Methods to Remove Unsemantic Text Prior to Diff}
\usage{
trimPrint(obj, obj.as.chr)
\S4method{trimPrint}{ANY,character}(obj, obj.as.chr)
trimStr(obj, obj.as.chr)
\S4method{trimStr}{ANY,character}(obj, obj.as.chr)
trimChr(obj, obj.as.chr)
\S4method{trimChr}{ANY,character}(obj, obj.as.chr)
trimDeparse(obj, obj.as.chr)
\S4method{trimDeparse}{ANY,character}(obj, obj.as.chr)
trimFile(obj, obj.as.chr)
\S4method{trimFile}{ANY,character}(obj, obj.as.chr)
}
\arguments{
\item{obj}{the object}
\item{obj.as.chr}{character vector, the \code{print}ed representation of
the object}
}
\value{
An integer matrix with \code{length(obj.as.chr)} rows and 2 columns giving
the start (first column) and end (second column) character positions of the
substring to run diffs on.
}
\description{
\code{\link[=diffPrint]{diff*}} methods, in particular \code{diffPrint},
modify the text representation of an object prior to running the diff to
reduce the incidence of spurious mismatches caused by unsemantic differences.
For example, we look to remove matrix row indices and atomic vector indices
(i.e. the \samp{[1,]} or \samp{[1]} strings at the beginning of each display
line).
}
\details{
Consider: \preformatted{
> matrix(10:12)
     [,1]
[1,]   10
[2,]   11
[3,]   12
> matrix(11:12)
     [,1]
[1,]   11
[2,]   12
}
In this case, a line-by-line diff would report every row of the matrix as
mismatched: where the data match (the rows containing 11 and 12), the row
indices do not. By trimming out the row indices before the diff, the diff
can recognize that rows 2 and 3 of the first matrix should be matched to
rows 1 and 2 of the second.
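
The effect can be reproduced with base R alone. The following is only an
illustrative sketch, not how the package implements trimming, and the helper
\code{strip} is a hypothetical name introduced for this example:
\preformatted{
a <- capture.output(print(matrix(10:12)))
b <- capture.output(print(matrix(11:12)))

# Compared verbatim, no data row of `a` equals any data row of `b`
# (the first element of each is the column header, so it is dropped)
intersect(a[-1], b[-1])

# Hypothetical helper: drop the leading "[n,]" row index from each line
strip <- function(x) sub("^[[][0-9]+,[]]", "", x)

# With the indices removed, the rows holding 11 and 12 line up
intersect(strip(a[-1]), strip(b[-1]))
}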

These methods follow an interface similar to that of the
\code{\link[=guides]{guide*}} methods, with one available for each
\code{diff*} method except for \code{diffCsv}, since that one uses
\code{diffPrint} internally. The unsemantic differences are added back after
the diff for display purposes, and are colored in grey to indicate that they
are ignored in the diff.

Currently only \code{trimPrint} and \code{trimStr} do anything meaningful.
\code{trimPrint} removes row index headers provided that they are of the
default unnamed variety. If you add row names, or if numeric row indices
do not ascend from 1, they will not be stripped, as those have meaning.
\code{trimStr} removes the \samp{..$}, \samp{..-}, and \samp{..@} tokens
to minimize spurious matches.

You can modify how text is trimmed by providing your own functions to the
\code{trim} argument of the \code{diff*} methods, or by defining
\code{trim*} methods for your objects. Note that these functions return the
start and end columns of the text that should be \emph{kept} and used in the
diff, not of the text to remove.
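
As a minimal sketch of such a function, the following keeps everything after
a leading atomic vector index such as \samp{[1]}. The name \code{trim_index}
is hypothetical. The sketch assumes that a plain function with the same
\code{(obj, obj.as.chr)} signature as the methods documented here is
acceptable as the \code{trim} argument:
\preformatted{
# Hypothetical custom trim function (not part of the package)
trim_index <- function(obj, obj.as.chr) {
  # Length of the leading "[n] " token on each line; -1 where there is none
  idx.len <- attr(regexpr("^ *[[][0-9]+[]] ", obj.as.chr), "match.length")
  # Keep the text after the index, or the whole line if no index was found
  start <- ifelse(idx.len > 0L, idx.len + 1L, 1L)
  end <- nchar(obj.as.chr)
  cbind(start = start, end = end)
}

trim_index(NULL, c("[1] 10 11", "[3] 12", "no index"))

# Assumed usage with a diff method:
# diffPrint(a, b, trim = trim_index)
}
Each row of the result describes one line of \code{obj.as.chr}, so the matrix
has the shape described in the Value section.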

As with guides, trimming is done on a best-effort basis and may fail with
\dQuote{pathological} display representations. Since the diff still works
even when trimming fails, this is considered an acceptable compromise.
Trimming is more likely to fail with nested recursive structures.
}
\note{
\code{obj.as.chr} will be as processed by
\code{\link{strip_hz_control}} and as such will not be identical to the
captured output if it contains tabs, newlines, or carriage returns.
}