File: mic_strength.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/mictools.R
\name{mic_strength}
\alias{mic_strength}
\title{Compute the association strength}
\usage{
mic_strength(x, pval, alpha = NULL, C = 5, pthr = 0.05,
  pval.col = NULL)
}
\arguments{
\item{x}{a numeric matrix with N samples on the rows and M variables on the columns (NxM).}

\item{pval}{a data.frame with the pvalues for each pair of associated variables of the \code{x} input matrix. It should also contain two columns with 
the indices (or names) of the two variables of each association, referring to the columns of the \code{x} input matrix.}

\item{alpha}{a float in (0, 1.0] or a value >= 4. If alpha is in (0, 1], then B will be max(n^alpha, 4), where n is the
number of samples. If alpha is >= 4, then alpha directly defines the B
parameter. If alpha is higher than the number of samples (n), it will be
limited to n, so B = min(alpha, n). Default value is 0.6 (see Details).}

\item{C}{a positive integer number, the \code{C} parameter of the \code{mine} statistic. 
See \code{\link[minerva]{mine}} function for further details.}

\item{pthr}{threshold on the pvalue: mic_e is computed only for the associations whose pvalue is below this threshold.}

\item{pval.col}{an integer or character vector identifying the columns of the \code{pval} data.frame that contain, in this order, the \code{pvalue}, 
the first variable and the second variable of each association in the \code{x} input matrix. See Details for further information.}
}
\value{
A data.frame with the \code{tic_e} pvalue, the \code{mic} value and the column identifiers, relative to the input matrix
\code{x}, of the variables for which the association is computed.
}
\description{
This function uses the null distribution of the \code{tic_e} computed with the function \code{\link[minerva]{mictools}}. 
Based on the available pvalues and the permutation null distribution, it identifies reliable associations between variables.
}
\details{
The method implemented here is a wrapper for the original method published by Albanese et al. (2018). The Python version
is available at \url{https://github.com/minepy/mictools}.

This function should be called after the estimation of the null distribution of \code{tic_e} scores based on permutations of the input data.

The \code{mic} association is computed only for the variables for which the pvalue in the \code{pval} \code{data.frame} is less than 
the threshold set with the \code{pthr} input parameter. 
We assume the first column of the \code{pval} \code{data.frame} contains the pvalue; a different column can be selected using 
\code{pval.col[1]}. 

By default, the \code{pval.col} parameter takes the first three columns of the \code{pval} \code{data.frame}: the column \code{pval.col[1]} contains the \code{pvalues} 
of the association between the variables given in columns \code{pval.col[2]} and \code{pval.col[3]}.
If a character vector is provided, the names in \code{pval.col} are matched with the column names of the \code{pval} \code{data.frame}.
If \code{NULL} is passed, it is assumed that the first column contains the pvalue, while the second and third contain the index or name of the variables in \code{x}.
If a single value is passed, it refers to the \code{pvalue} column, and the two following columns are assumed to contain the variable indexes.
}
\examples{
data(Spellman)
mydata <- as.matrix(Spellman[, 10:20])
ticenull <- mictools(mydata, nperm=1000)

## Use the nominal pvalue:
ms <- mic_strength(mydata, pval=ticenull$pval, alpha=NULL, pval.col = c("pval", "Var1", "Var2"))

## Use the adjusted pvalue:
ms <- mic_strength(mydata, pval=ticenull$pval, alpha=NULL, pval.col = c(6, 4,5))

ms 
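
## Illustrative sketch only (not part of minerva): how the rule documented
## for the alpha argument maps alpha to the B parameter.
## The helper name b_from_alpha is hypothetical and exists only in this example.
b_from_alpha <- function(alpha, n) {
  if (alpha > 0 && alpha <= 1) {
    max(n^alpha, 4)          # alpha in (0, 1]: B = max(n^alpha, 4)
  } else if (alpha >= 4) {
    min(alpha, n)            # alpha >= 4: B = alpha, capped at n
  } else {
    stop("alpha must be in (0, 1] or >= 4")
  }
}
b_from_alpha(0.6, n = 100)   # exponent form
b_from_alpha(200, n = 100)   # direct form, limited to the number of samples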

\dontrun{
## Use qvalue
require(qvalue)
qobj <- qvalue(ticenull$pval$pval)
ticenull$pval$qvalue <- qobj$qvalue
ms <- mic_strength(mydata, pval=ticenull$pval, alpha=NULL, pval.col = c("qvalue", "Var1", "Var2"))
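
## Illustrative sketch only (not minerva code): one possible reading of how
## pval.col is resolved to three column positions, following the rules in Details,
## and of how pthr then selects the associations to keep.
## The helper resolve_pval_col and the toy data.frame are hypothetical.
resolve_pval_col <- function(pval, pval.col = NULL) {
  if (is.null(pval.col)) return(1:3)                       # default: first three columns
  if (is.character(pval.col)) pval.col <- match(pval.col, names(pval))
  if (length(pval.col) == 1L) pval.col <- pval.col + 0:2   # pvalue column plus the next two
  pval.col
}
toy <- data.frame(pval = c(0.01, 0.2), Var1 = c("a", "a"),
                  Var2 = c("b", "c"), qvalue = c(0.03, 0.2))
resolve_pval_col(toy)
resolve_pval_col(toy, "pval")
idx <- resolve_pval_col(toy, c("qvalue", "Var1", "Var2"))
toy[toy[[idx[1]]] < 0.05, ]   # rows that would pass pthr = 0.05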

## Get the data from mictools repository

lnf <- "https://raw.githubusercontent.com/minepy/mictools/master/examples/datasaurus.txt"
datasaurus <- read.table(lnf, header=TRUE, row.names = 1, stringsAsFactors = FALSE)
datasaurus <- t(datasaurus)
ticenull <- mictools(datasaurus, nperm=200000)
micres <- mic_strength(datasaurus, ticenull$pval, pval.col=c(6, 4, 5))

## Plot distribution of pvalues
hist(ticenull$pval$pval, breaks=50, freq=FALSE)

## Plot distribution of tic_e values
hist(ticenull$tic)

## Correct pvalues using qvalue package
require(qvalue)
require(ggplot2)
qobj <- qvalue(ticenull$pval$pval)
ticenull$pval$qvalue <- qobj$qvalue
micres <- mic_strength(datasaurus, ticenull$pval, pval.col=c("qvalue", "Var1", "Var2"))

hist(qobj$qvalue)

df <- data.frame(pi0.lambda=qobj$pi0.lambda, lambda=qobj$lambda, pi0.smooth=qobj$pi0.smooth)
gp0 <- ggplot(df, aes(lambda, pi0.lambda)) + geom_point() 
gp0 <- gp0 + geom_line(aes(lambda, pi0.smooth))
gp0 <- gp0 + geom_hline(yintercept = qobj$pi0, linetype="dashed", col="red")
}
}
\seealso{
\code{\link[minerva]{mine}}, \code{\link[minerva]{mictools}}, \code{\link[stats]{p.adjust}}
}