File: QTLtools-fenrich.1

package info (click to toggle)
qtltools 1.3.1%2Bdfsg-2
  • links: PTS, VCS
  • area: main
  • in suites: bullseye
  • size: 14,996 kB
  • sloc: cpp: 15,321; makefile: 243; sh: 63; ansic: 51
file content (225 lines) | stat: -rw-r--r-- 6,386 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
.\" Manpage for QTLtools fenrich.
.\" Contact halitongen@gmail.com to correct errors or typos.
.TH QTLtools-fenrich 1 "06 May 2020" "QTLtools-v1.3" "Bioinformatics tools"
.SH NAME
QTLtools fenrich \- Functional enrichment of molecular QTLs
.SH SYNOPSIS
.B QTLtools fenrich  \-\-qtl
.IR significanty_genes.bed
.B \-\-tss
.IR gene_tss.bed
.B \-\-bed
.IR TFs.encode.bed.gz
.B \-\-out
.IR output.txt
.I [OPTIONS]
.SH DESCRIPTION
This mode allows assessing whether a set of QTLs fall within some functional annotations more often than what is expected by chance.
The method is detailed in <https://www.nature.com/articles/ncomms15452>.
Here, we mean by chance is what is expected given the non-uniform distributions of molQTLs and functional annotations around the genomic positions of the molecular phenotypes.
To do so, we first enumerate all the functional annotations located nearby a given molecular phenotype.
In practice, for X phenotypes being quantified, we have X lists of annotations.
And, for the subset Y of those having a significant molQTL, we count how often the Y molQTLs overlap the annotations in the corresponding lists: this gives the observed overlap frequency fobs(Y) between molQTLs and functional annotations.
Then, we permute the lists of functional annotations across the phenotypes (e.g, phenotype A may be assigned the list of annotations coming from phenotype B) and for each permuted data set, we count how often the Y molQTLs do overlap the newly assigned functional annotations: this gives the expected overlap frequency fexp(Y) between molQTLs and functional annotations.
By doing this permutation scheme, we keep the distribution of functional annotations and molQTLs around molecular phenotypes unchanged.
Now that we have the observed and expected overlap frequencies, we use a fisher test to assess how fobs(Y) and fexp(Y) differ.
This gives an odd ratio estimate and a two-sided p-value which basically tells us first if there is and enrichment or depletion, and second how significant this is.

.SH OPTIONS
.TP
.B \-\-qtl \fIin.bed\fB
List of QTLs of interest in BED format.
REQUIRED.
.TP
.B \-\-bed \fIfunctional_annotation.bed.gz\fR
Functional annotations in BED format.
REQUIRED.
.TP
.B \-\-tss \fIgenes.bed\fR
List of positions of all phenotypes you mapped QTLs for, in BED format.
REQUIRED.
.TP
.B \-\-out \fIoutput.txt\fR
Output file.
REQUIRED.
.TP
.B \-\-permute \fIinteger\fR
Number of permutation to run.
DEFAULT=1000

.SH INPUT FILES
.TP 1
.B \-\-qtl file
List of QTLs of interest.
An example:
.sp 1
1	15210	15211	1_15211	ENSG00000227232.4	-
.sp 0
1	735984	735985	1_735985	ENSG00000177757.1	+
.sp 0
1	735984	735985	1_735985	ENSG00000240453.1	-
.sp 0
1	739527	739528	1_739528	ENSG00000237491.4	+
.sp 1
The column definitions are:
.TS
n lx .
1	T{
The variant chromosome
T}
2	T{
The variant's start position (\fB0-based\fR)
T}
3	T{
The variant's end position (\fB1-based\fR)
T}
4	T{
The variant ID
T}
5	T{
The phenotype ID
T}
6	T{
The phenotype's strand. (\fBnot used\fR)
T}
.TE
.ta T 5
.TP 1
.B \-\-bed file
List of annotations in BED format.
An example:
.sp 1
1	254874	265487
.sp 0
1	730984	735985
.sp 0
1	734984	736585
.sp 0
1	739527	748528
.sp 1
The column definitions are:
.TS
n lx .
1	T{
Chromosome
T}
2	T{
Start position (\fB0-based\fR)
T}
3	T{
End position (\fB1-based\fR)
T}
.TE
.ta T 5
.TP 1
.B \-\-tss file
List of positions of all phenotypes you mapped QTLs for.
An example:
.sp 1
1	29369	29370	ENSG00000227232.4	1_15211	-
.sp 0
1	135894	135895	ENSG00000268903.1	1_985446	-
.sp 0
1	137964	137965	ENSG00000269981.1	1_1118728	-
.sp 0
1	317719	317720	ENSG00000237094.7	1_15211	+
.sp 1
The column definitions are:
.TS
n lx .
1	T{
Phenotype's chromosome
T}
2	T{
The start position of the phenotype (\fB0-based\fR)
T}
3	T{
The end position of the phenotype (\fB1-based\fR)
T}
4	T{
The phenotype ID
T}
5	T{
Top variant (\fBnot used\fR)
T}
6	T{
The phenotype's strand
T}
.TE

.SH OUTPUT FILE
.TP 1
.B \fB\-\-out\fR file
Space separated results output file detailing the enrichment with the following columns:
.TS
n lx .
1	T{
The observed number of QTLs falling within the functional annotations
T}
2	T{
The total number of QTLs
T}
3	T{
The mean expected number of QTLs falling within the functional annotations (across multiple permutations)
T}
4	T{
The standard deviation of the expected number of QTLs falling within the functional annotations (across multiple permutations)
T}
5	T{
The empirical p-value
T}
6	T{
Lower bound of the 95% confidence interval of the odds ratio
T}
7	T{
The odds ratio
T}
8	T{
Upper bound of the 95% confidence interval of the odds ratio
T}
.TE

.SH EXAMPLE
.IP 1 2
You need to prepare a BED file containing the positions of the QTLs of interest.
To do so, extract all significant hits at a given FDR threshold (e.g. 5%), and then transform the significant QTL list into a BED file:
.IP "" 2
Rscript ./script/qtltools_runFDR_cis.R results.genes.full.txt.gz 0.05 results.genes
.sp 0
cat results.genes.significant.txt | awk '{ print $9, $10-1, $11, $8, $1, $5 }' | tr ' ' '\\t' | sort \-k1,1V \-k2,2g > results.genes.significant.bed
.IP 2 2
Prepare a BED file containing the positions of all phenotypes you mapped QTLs for:
.IP "" 2
zcat results.genes.full.txt.gz | awk '{ print $2, $3-1, $4, $1, $8, $5 }' | tr ' ' '\\t' | sort \-k1,1V \-k2,2g > results.genes.quantified.bed
.IP 3 2
Run the enrichment analysis:
.IP "" 2
QTLtools fenrich \-\-qtl results.genes.significant.bed \-\-tss results.genes.quantified.bed \-\-bed TFs.encode.bed.gz \-\-out enrichment.QTL.in.TF.txt

.SH SEE ALSO
.IR QTLtools (1)
.\".IR QTLtools-bamstat (1),
.\".IR QTLtools-mbv (1),
.\".IR QTLtools-pca (1),
.\".IR QTLtools-correct (1),
.\".IR QTLtools-cis (1),
.\".IR QTLtools-trans (1),
.\".IR QTLtools-fenrich (1),
.\".IR QTLtools-fdensity (1),
.\".IR QTLtools-rtc (1),
.\".IR QTLtools-rtc-union (1),
.\".IR QTLtools-extract (1),
.\".IR QTLtools-quan (1),
.\".IR QTLtools-rep (1),
.\".IR QTLtools-gwas (1),
.PP
QTLtools website: <https://qtltools.github.io/qtltools>
.SH BUGS
.IP o 2
Please submit bugs to <https://github.com/qtltools/qtltools>
.SH
CITATION
Delaneau, O., Ongen, H., Brown, A. et al. A complete tool set for molecular QTL discovery and analysis. \fINat Commun\fR \fB8\fR, 15452 (2017).
<https://doi.org/10.1038/ncomms15452>
.SH AUTHORS
Olivier Delaneau (olivier.delaneau@gmail.com), Halit Ongen (halitongen@gmail.com)