\name{PWMSimilarity-methods}
\docType{methods}
\alias{PWMSimilarity}
\alias{PWMSimilarity-methods}
\alias{PWMSimilarity,matrix,matrix-method}
\alias{PWMSimilarity,matrix,PWMatrix-method}
\alias{PWMSimilarity,PWMatrix,matrix-method}
\alias{PWMSimilarity,PWMatrix,PWMatrix-method}
\alias{PWMSimilarity,PWMatrixList,matrix-method}
\alias{PWMSimilarity,PWMatrixList,PWMatrix-method}
\alias{PWMSimilarity,PWMatrixList,PWMatrixList-method}
\title{PWMSimilarity method}
\description{
This function measures the similarity of two PWMs using one of three measures:
"normalised Euclidean distance", "Pearson correlation" or "Kullback-Leibler divergence".
}
\usage{
PWMSimilarity(pwmSubject, pwmQuery, method=c("Euclidean", "Pearson", "KL"))
}
\arguments{
\item{pwmSubject}{
A \code{matrix}, \code{PWMatrix} or \code{PWMatrixList} object of type \dQuote{prob}.
}
\item{pwmQuery}{
A \code{matrix}, \code{PWMatrix} or \code{PWMatrixList} object of type \dQuote{prob}.
}
\item{method}{
The similarity measure to use: one of "Euclidean", "Pearson" or "KL".
}
}
\details{
When \code{pwmSubject} and \code{pwmQuery} have different numbers of columns,
the shorter PWM is shifted along the longer PWM, starting from its first position,
and every possible alignment is compared.
Only the best score is reported: the smallest distance or divergence, or the largest correlation.
%Given two PWMs, P^1 and P^2, where l is the length.
%P_{i,b} is the values in column i with base b.
%The normalised Euclidean distance is computed in
%D(a,b) = {1 \over {\sqrt{2}l}} \cdot \sum_{i=1}^{l} \sqrt{\sum_{b \in {\{A,C,G,T\}}} (P_{i,b}^1-P_{i,b}^2)^2}
%.
%This distance is between 0 (perfect identity) and 1 (complete dis-similarity).
%The pearson correlation coefficient is computed in
%r(P^1, P^2) = {1 \over l} \cdot \sum_{i=1}^l {\sum_{b \in \{A,C,G,T\}} (P_{i,b}^1 - 0.25)(P_{i,b}^2-0.25) \over \sqrt{\sum_{b \in \{A,C,G,T\}} (P_{i,b}^1 - 0.25)^2 \cdot \sum_{b \in \{A,C,G,T\}} (P_{i,b}^2 - 0.25)^2}}.
%The Kullback-Leibler divergence is computed in
%KL(P^1, P^2) = {1 \over {2l}} \cdot \sum_{i=1}^l \sum_{b \in \{A,C,G,T\}} (P_{i,b}^1\log{ P_{i,b}^1 \over P_{i,b}^2}+ P_{i,b}^2\log{P_{i,b}^2 \over {P_{i,b}^1}}).
}
\value{
A \code{numeric} value is returned.
}
\references{
Linhart, C., Halperin, Y., & Shamir, R. (2008). Transcription factor and microRNA motif discovery: The Amadeus platform and a compendium of metazoan target sets. Genome Research, 18(7), 1180-1189. doi:10.1101/gr.076117.108
}
\section{Methods}{
\describe{
\item{\code{signature(pwmSubject = "matrix", pwmQuery = "matrix")}}{
}
\item{\code{signature(pwmSubject = "matrix", pwmQuery = "PWMatrix")}}{
}
\item{\code{signature(pwmSubject = "PWMatrix", pwmQuery = "matrix")}}{
}
\item{\code{signature(pwmSubject = "PWMatrix", pwmQuery = "PWMatrix")}}{
}
\item{\code{signature(pwmSubject = "PWMatrixList", pwmQuery = "matrix")}}{
}
\item{\code{signature(pwmSubject = "PWMatrixList", pwmQuery = "PWMatrix")}}{
}
\item{\code{signature(pwmSubject = "PWMatrixList", pwmQuery = "PWMatrixList")}}{
}
}}
\seealso{
\code{\link{PFMSimilarity}}
}
\examples{
data(MA0003.2)
data(MA0004.1)
pwm1 = toPWM(MA0003.2, type="prob")
pwm2 = toPWM(MA0004.1, type="prob")
PWMSimilarity(pwm1, pwm2, method="Euclidean")
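
## The other two measures listed under 'method', applied to the same pair.
PWMSimilarity(pwm1, pwm2, method="Pearson")
PWMSimilarity(pwm1, pwm2, method="KL")

## Illustration only, not part of the package: a minimal sketch of the
## normalised Euclidean distance for two probability matrices of equal
## width, assuming the usual definition (per-column Euclidean distance,
## averaged over columns and scaled by 1/sqrt(2)).
## The helper 'normEuclidean' and the toy matrices 'm1' and 'm2' are
## hypothetical names introduced for this example.
normEuclidean <- function(p1, p2){
  stopifnot(identical(dim(p1), dim(p2)))
  sum(sqrt(colSums((p1 - p2)^2))) / (sqrt(2) * ncol(p1))
}
## Column 1 differs completely between m1 and m2; column 2 is identical.
m1 <- matrix(c(1, 0, 0, 0,
               0.25, 0.25, 0.25, 0.25),
             nrow=4, dimnames=list(c("A", "C", "G", "T"), NULL))
m2 <- matrix(c(0, 1, 0, 0,
               0.25, 0.25, 0.25, 0.25),
             nrow=4, dimnames=list(c("A", "C", "G", "T"), NULL))
normEuclidean(m1, m2)
## The two values should agree if the package uses the same definition.
PWMSimilarity(m1, m2, method="Euclidean")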
}