File: intersectClonesets.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/crosses.R
\name{intersectClonesets}
\alias{intersectClonesets}
\alias{intersectCount}
\alias{intersectLogic}
\alias{intersectIndices}
\title{Intersection between sets of sequences or any elements.}
\usage{
intersectClonesets(.alpha = NULL, .beta = NULL, .type = "n0e", .head = -1, .norm = F,
          .verbose = F)

intersectCount(.alpha, .beta, .method = c('exact', 'hamm', 'lev'), .col = NULL)

intersectIndices(.alpha, .beta, .method = c('exact', 'hamm', 'lev'), .col = NULL)

intersectLogic(.alpha, .beta, .method = c('exact', 'hamm', 'lev'), .col = NULL)
}
\arguments{
\item{.alpha}{First vector or data frame, or a list of data frames.}

\item{.beta}{Second vector or data frame, or, if \code{.alpha} is a list, the type of intersection procedure (see the \code{.type} parameter).}

\item{.type}{Type of intersection procedure if \code{.alpha} and \code{.beta} are data frames. A string of 3 characters (see 'Details' for more information).}

\item{.head}{Parameter for the \code{head} function, applied before intersecting.}

\item{.norm}{If TRUE, then normalise the result by the product of the lengths or numbers of rows of the given data.}

\item{.verbose}{If TRUE, then print output while processing the data.}

\item{.method}{Method to use for intersecting string elements: 'exact' for exact matching, 'hamm' for matching strings whose Hamming distance is <= 1,
'lev' for matching strings whose Levenshtein (edit) distance is <= 1.}

\item{.col}{Which columns to use for fetching values to intersect. The first supplied column is matched with \code{.method}; the others are matched exactly.}
}
\value{
\code{intersectClonesets} returns the (optionally normalised) number of similar elements, or a matrix with the numbers of similar elements for each pair of objects.

\code{intersectCount} returns the number of similar elements.

\code{intersectIndices} returns a 2-column matrix in which the first column holds the index of an element in the given \code{.alpha}, and the second column holds the index of an element of \code{.beta} that is similar to that element of \code{.alpha}.

\code{intersectLogic} returns a logical vector of \code{length(.alpha)} or \code{nrow(.alpha)}, where TRUE at position \code{i} means that the element with index \code{i} has been found in \code{.beta}.
}
\description{
Functions for the intersection of data frames with TCR / Ig data. 
See the \code{repOverlap} function for a general interface to all overlap analysis functions.

\code{intersectClonesets} - returns the number of similar elements in the two given clonesets / data frames, or a matrix
with counts of similar elements for each pair of objects in the given list.

\code{intersectCount} - similar to \code{tcR::intersectClonesets}, but with fewer parameters and only for two objects.

\code{intersectIndices} - returns a matrix M with two columns, where the element with index M[i, 1] in the first
given object is similar to the element with index M[i, 2] in the second given object.

\code{intersectLogic} - returns a logical vector with TRUE values at positions where an element of the first given data frame
is found in the second given data frame.
}
\details{
The \code{.type} parameter of the \code{intersectClonesets} function is a string of length 3
of the form \code{[0an][0vja][ehl]}, where:
\enumerate{
 \item The first character defines which elements to intersect ("a" for elements from the column "CDR3.amino.acid.sequence",
 "n" for elements from the column "CDR3.nucleotide.sequence", any other character to intersect the elements as specified);
 \item The second character defines which additional columns to use
('0' for no additional columns, 'v' for the "V.gene" column,
'j' for the "J.gene" column, 'a' for both the "V.gene" and "J.gene" columns);
 \item The third character defines the method used to search for similar sequences:
 "e" stands for an exact match of sequences, "h" for matching elements whose Hamming distance
 is less than or equal to 1, "l" for matching elements whose Levenshtein distance is less than or equal to 1.
}
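
As an illustration only, the three characters of \code{.type} could be decoded as sketched below.
The helper \code{decodeType} is hypothetical and is not part of tcR; the mapping of the third
character onto the \code{.method} values is an assumption based on the descriptions above.
\preformatted{
# Hypothetical helper, not part of tcR: decode a .type string such as "ave".
decodeType <- function(.type) {
  stopifnot(is.character(.type), length(.type) == 1, nchar(.type) == 3)
  ch <- strsplit(.type, "")[[1]]
  # First character: which sequence column to intersect.
  seq.col <- switch(ch[1],
                    a = "CDR3.amino.acid.sequence",
                    n = "CDR3.nucleotide.sequence",
                    NULL)  # any other character: intersect elements as given
  # Second character: additional columns to match on.
  extra.cols <- switch(ch[2],
                       v = "V.gene",
                       j = "J.gene",
                       a = c("V.gene", "J.gene"),
                       character(0))  # "0": no additional columns
  # Third character: matching method (assumed to mirror the .method values).
  method <- switch(ch[3],
                   e = "exact",
                   h = "hamm",
                   l = "lev",
                   stop("unknown method character: ", ch[3]))
  list(cols = c(seq.col, extra.cols), method = method)
}

decodeType("ave")
}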
}
\examples{
\dontrun{
data(twb)
# Equivalent to intersectClonesets(twb[[1]]$CDR3.nucleotide.sequence,
#                         twb[[2]]$CDR3.nucleotide.sequence)
# or intersectCount(twb[[1]]$CDR3.nucleotide.sequence,
#                    twb[[2]]$CDR3.nucleotide.sequence)
# First "n" stands for a "CDR3.nucleotide.sequence" column, "e" for exact match.
twb.12.n0e <- intersectClonesets(twb[[1]], twb[[2]], 'n0e')
stopifnot(twb.12.n0e == 46)
# First "a" stands for "CDR3.amino.acid.sequence" column.
# Second "v" means that intersect should also use the "V.gene" column.
intersectClonesets(twb[[1]], twb[[2]], 'ave')
# Works also on lists, performs all possible pairwise intersections.
intersectClonesets(twb, 'ave')
# Plot results.
vis.heatmap(intersectClonesets(twb, 'ave'), .title = 'twb - (ave)-intersection', .labs = '')
# Get elements which are in both twb[[1]] and twb[[2]].
# Elements are tuples of CDR3 amino acid sequence and corresponding V-segment
imm.1.2 <- intersectLogic(twb[[1]], twb[[2]],
                           .col = c('CDR3.amino.acid.sequence', 'V.gene'))  
head(twb[[1]][imm.1.2, c('CDR3.amino.acid.sequence', 'V.gene')])
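# Illustration only: a base-R sketch of what matching with Hamming distance <= 1
# (the 'hamm' method) means. hammingWithin1 is a hypothetical helper, not a tcR
# function, and the sequences below are made up.
hammingWithin1 <- function(a, b) {
  # Hamming distance is defined only for strings of equal length.
  if (nchar(a) != nchar(b)) return(FALSE)
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]]) <= 1
}
hamm.x <- c("CASSLG", "CASSQE", "CATTTT")
hamm.y <- c("CASSLA", "CASSQE")
# For each element of hamm.x, is there an element of hamm.y within distance 1?
hamm.found <- sapply(hamm.x, function(s) any(sapply(hamm.y, hammingWithin1, a = s)),
                     USE.NAMES = FALSE)
hamm.found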
data(twb)
ov <- repOverlap(twb)
sb <- matrixSubgroups(ov, list(tw1 = c('Subj.A', 'Subj.B'), tw2 = c('Subj.C', 'Subj.D')))
vis.group.boxplot(sb)
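# Illustration only (base R, not the tcR implementation): the kind of two-column
# index matrix M described for intersectIndices, built here for exact matches on
# made-up sequences. The row order returned by tcR may differ from this sketch.
idx.x <- c("CASSLG", "CASSQE", "CATTTT")
idx.y <- c("CASSQE", "CAWWWW", "CASSLG")
# Each row of M is a pair (i, j) such that idx.x[i] equals idx.y[j].
M <- which(outer(idx.x, idx.y, "=="), arr.ind = TRUE)
M
# Every indexed pair refers to equal elements.
all(idx.x[M[, 1]] == idx.y[M[, 2]])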
}
}
\seealso{
\link{repOverlap}, \link{vis.heatmap}, \link{ozScore}, \link{permutDistTest}, \link{vis.group.boxplot}
}