File: ChangeLog

package info (click to toggle)
probabel 0.5.0%2Bdfsg-3
  • links: PTS, VCS
  • area: main
  • in suites: buster
  • size: 12,680 kB
  • sloc: cpp: 6,357; sh: 2,355; ansic: 1,676; makefile: 934; asm: 284; perl: 263; awk: 47
file content (493 lines) | stat: -rw-r--r-- 21,042 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
***** v.0.5.0 (2016.05.04)
* ProbABEL now depends on the Eigen library. The option to compile without
  Eigen has been removed. Eigen 3.2.1 is included with ProbABEL. The
  directory src/eigen-3.2.1 contains header files we need. The full Eigen
  source code tar-ball is included as src/3.2.1.tar.bz2.
* Fixed Issue #15: Coxph interactions not working; when using the
  --interaction option the name of the model was wrong (e.g. for
  --interaction=1, the interaction term was shown as mu * SNP_A1, which
  obviously is not what we meant to do).
* Fixed Github issue #16: "Too many NANs for SNPs with low `Rsq` when some
  commandline options are specified"
* Added --flipmaf command line option to palinear, pacoxph and palogist.
  If this option is selected, ProbABEL will 'flip' the alleles A1 and A2
  for those variants for which the Freq1 value in the info file is less
  than 0.5. Changing the reference and effect allele in this way can
  sometimes (for some genetic models and low MAF), reduce regression
  errors and thus result in proper estimates which otherwise be set to
  NaN. Many thanks for the group of Prof. Kiemeney at the dept. of Health
  Evidence Radboud UMC in Nijmegen (NL) for sponsoring the work on this
  feature! Also thanks to Tessel Galesloot from the same group for
  extensive testing of this release. (Pull Request #24, also fixes bug #5).
* Don't set the betas and se_betas for all covariates in case Cox
  regression problems like singularities occur, but only the beta/se_beta
  of the corresponding covariate. Also make the corresponding warning
  messages more clear by including the name of the genetic model that is
  being tested.
* Not-a-number values in the output are now printed as "nan" instead of
  "NaN".
* Linear regression mmscore option is 2 times faster
* Speedup of Linear regression: (measured with multiple runs with 2 covariates,
  33815 SNP and 3485 people (using all) compared to v.0.4.3)
** Reading mldose/mlprob files 14 times faster
** Calculation of regression is more than 3.5 faster
** Overall runtime using above settings is 5 times faster
* Handles multiple variants of NaN (NA,Na,Nan,na,nan) correct while reading of
  mldose/mlprob files
* Fixed Issue #21: Remove "interaction_only" option. According to Yurii's
  post on the forum this option never worked as expected (Pull Request
  #23). Thanks to Matthias Wuttke for reminding us of this bug.
* Fixed bugs #5883 & #6010 (on R-forge): The main effect is displayed in
  the output with the  `interaction_only` option. Thanks to Maksim
  Struchalin & Lennart Karssen for fixing it and to Farid Radmanesh for
  reporting it. This was only a display error, computations were not
  affected.
* Fixed bug #5982: ProbABEL's make install fails on MacOS X and FreeBSD
  with sed error. Thanks to forum user mmold for reporting the bug.


***** v.0.4.5 (2015.05.26)
* Fixed bug #6054: "Not all ProbABEL's short command line options are
  correctly parsed."
* Fixed bug #6041: "ProbABEL's Cox module reports too many errors (beta
  may be infinite, setting beta and se to 'NaN')". Thanks to Anne
  Grotenhuis from the Radboud Medical Centre Nijmegen and forum user
  quentin300 for their extensive bug reports and help in testing the fix.
* Fixed bug #1266: "pacoxph with no covariates"; ProbABEL now also works
  when no covariates have been provided. Thanks to Aaron Joon for
  reporting this bug back in 2011 and to Anne Grotenhuis and to forum user
  quentin300 for pushing this bug to my attention and their help in beta
  testing.
* pacoxph now displays the regression equation correctly.
* Minor fix to the manual: the Rsq values in the info file should be >
  1e-16 (and not > 0 as mentioned before) otherwise the output will be set
  to 'nan'.
* Minor fix to the man-pages: according to the man-pages palogist,
  palinear and pacoxph all did regression using a linear model.


***** v.0.4.4 (2014.11.07)
* Fixed bug #5729 in the Cox PH module. Some checks for problems with the
  regression were incorrectly implemented. Thanks to Matthias Wuttke from
  the University Medical Centre Freiburg, Anne Grotenhuis from the Radboud
  Medical Centre Nijmegen, and Luba Pardo and Joris Verkouteren from the
  Erasmus Medical Centre Rotterdam for their time and effort in reporting
  the bug, helping to identify the problem and testing the fix.
* Backported fix of bug #5982: ProbABEL's make install fails on MacOS X
  and FreeBSD with sed error. Thanks to forum user mmold for reporting the bug.


***** v.0.4.3 (2014.04.01)
* Speed-up of a factor of ~ 2 for linear, logistic and Cox regression when
  using filevector input files.
* Fixed bug #5404: "ProbABEL's R check for Cox regression doesn't check if
  the survival package is installed".
* Fixed bug #5403: "The ProbABEL manual doesn't contain any information on
  how to install ProbABEL"


***** v.0.4.2 (2014.01.02)
* The 'probabel.pl' script is now simply renamed to 'probabel' (a user
  shouldn't care what scripting language we use). For at least several
  releases to come, the old script name will still exist (as a link to the
  original) and a warning message is displayed when the user runs the
  .pl script. This should give people time to adjust their pipelines.
* Fix bug #4919: Too small reading buffers for long alleles in mach info
  and legend files. Thanks to Daniel Taliun for reporting the bug and
  providing the patch. Thanks to Xia Shen for testing.
* Fix bug #4776: Specifying --sep="\t" as an option to pa* doesn't insert
  a TAB as separator. Now ProbABEL inserts proper tabs when specifying
  "\t". Thanks to Maksim Struchalin for fixing this bug.
* Improved convergence checks in the Cox PH regression module. The checks
  now give similar errors as R does.
* A minor change in the screen output of ProbABEL. Some of the status
  messages ("Reading phenotype data" etc.) have been added or move to a
  slightly different place in the code to help debugging problems with the
  input data.
* Fix a bug in the example scripts: an incorrect shell variable was used.
* The extract-snp binary is no longer built by default (as it isn't
  finished yet). Building can be enabled at compile time by giving the
  --enable-extract-snp option to ./configure.
* For developers: If R is installed, running 'make check' will also
  compare the results of ProbABEL with those of the same regressions in
  R.
* For developers: a start has been made on documenting the internal
  functions using Doxygen.


***** v.0.4.1 (2013.08.29)
* Fix bug #4854: When using mmscore, there is one (nan) column missing in
  the output for low-frequency SNPs. Also includes a simplification of the
  R-based test scripts.


***** v.0.4.0 (2013.08.25)
* The output files now contain a chi^2 column with the chi^2 value based
  on the LRT when not using --mmscore. When using --mmscore, the chi^2
  values are calculated from the Wald statistic (not implemented for the
  2df model yet).
* Fixed long-standing bug #1130: The Cox regression module of ProbABEL
  crashes (when using filevector files). Actually, the Cox PH module was
  broken ever since ProbABEL v0.1-9e. Thanks to the Erasmus Medical
  Center, Rotterdam (the group of prof. C.M. van Duijn) and the	Radboud
  University Medical Center, Nijmegen (the group of prof. Kiemeney) for
  partially sponsoring the time spent on this fix. Also thanks to Ben
  Verhaaren and Anne Grotenhuis for testing!
* Fixed bug #2575: ProbABEL chooses weird file names if the -o option is
  not specified. As a result of this fix, usage of the -o option with
  probabel.pl is slightly changed. Before the string specified after -o
  was added to the name of the phenotype file. Now it indicated the
  complete start of the file name (as before model-dependant strings like _2df
  will be added automatically).
* Fixed bug #2772: palogist with mmscore crashes (memory issue). Thanks to
  Anuj Goel for reporting this bug.
* Fixed bug #4700: wrong order of betas and se_betas in header of 2df
  output files. Thanks to Anne Grotenhuis for reporting this bug and to
  Yurii Aulchenko for solving it.
* Fixed bug #4683: probabel.pl writes to the wrong directory if a relative
  path is given on the command line (Thanks to Aaron Isaacs for reporting
  this bug). NOTE: As a result of this fix giving the -o option to
  probabel.pl is no longer interpreted as an addition to the default
  prefix (which was the name of the phenotype file), but as a full file
  name (incl. path).
* Fixed bug #2598: The prepare_data.R script is mentioned in the manual,
  but not distributed in .deb or .tar.gz
* Fixed bug #2529: ProbABEL doesn't warn when a required file is not
  specified on the command line
* Fix small bug where running "pa{linear,logist,coxph} -h" complained
  about a missing argument.
* The examples directory has been cleaned up and the test suite has been
  moved to a separate directory.


***** v.0.3.0 (2013.01.01)
* This is a major rewrite of several important parts of the ProbABEL
  code. ProbABEL can now make use of the Eigen matrix library
  (http://eigen.tuxfamily.org) for matrix operations. Especially when
  using the mmscore option this makes ProbABEL run ~ 5 times faster.
  Unfortunately the CoxPH module is still broken.
  This version completes several months of work by Maarten Kooyman who
  took up this challenge. Thanks to Maarten!
* Fixed bug #2436: probabel.pl doesn't overwrite output file (as it used
  to do before 0.2.2). This bug was introduced with the possibility of
  using SNP dosages/probabilities that are split into (sub-chromosomal)
  chunks.
* Fixed a small bug (without ID) where the column/row addressing of
  matrices as incorrectly checked (off-by-one error) (svn rev.1056).


***** v.0.2.2 (2012.11.05)
* No change in the code compared to v.0.2.1. Due to a mistake with the
  Ubuntu packaging (which was based on SVN r997, which contained a
  major bug in the tests and which was uploaded to the GenABEL PPA)
  I'm releasing a new package based on the same source code as
  ProbABEL v.0.2.1 (except for the version numbers of course).


***** v.0.2.1 (2012.11.05)
* Fixed bug #2295: the inverse variance-covariance matrix (used with
  the --mmscore option) was incorrectly subsetted when NAs are present
  for one or more SNP dosages (so this is not an issue for people using
  imputed data). As a result the invvarmatrix that was actually used in
  the regression contained rows and columns of zeroes. Thanks to Maarten
  Kooyman for reporting this bug.

* Fixed bug #1186: When .map file is missing (but --map option was
  given), the wrong error message was displayed. Thanks to Nicola
  Pirastu for reporting this bug.

* Fixed bug #2147: The value of the Rsq column in the info file should
  be > 0, unlike what was mentioned in the documentation.

* Update of the probabel.pl script and probabel_config.cfg.
  The .cfg file now accepts the chr separator in multiple locations in the path
  (thanks to Marijn Verkerk).
  probabel.pl can now also run Y chromosome analysis and the help
  message has been updated.

* probabel.pl and probabel_config.cfg now also accept chunks, where
  dose, prob, info and map files are split into multiple chunks. This
  is now the default for people following the 1000 genomes imputation
  cookbook for MaCH/minimac (the recipe uses the chunkchromosome tool
  to split the data into smaller pieces, speeding up imputation on
  computer clusters). See probabel_config.cfg.example for an
  example. (Lennart)



***** v.0.2.0 (2012.06.10)
* The v.0.1-9e fix for working with prob files in pacoxph has been
  forward-ported to this branch as well (Lennart and Yurii).

* pacoxph will not be built by default, because it is quite buggy. See
  ./configure notes below on how to enable building it anyway (Lennart).

* ProbABEL can now (experimentally!) analyze binary traits accounting
  for relationship structure (thanks to Yurii and Nicola Pirastu). This
  adds '--mmscore' option for logistic regression. Important: compared
  to 'palinear' the matrix which should be supplied to palogist with
  --mmscore should contain an inverse of the correlation matrix (not the
  inverse of var-cov matrix, as is the case for 'palinear' with
  --mmscore). This matrix can be obtained through (GenABEL notation):

h2.object$InvSigma * h2.object$h2an$estimate[length(h2.object$h2an$estimate)]

  where h2.object is the object returned by 'polygenic()'. Documentation
  explains this procedure in more details; a wrapper function to prepare
  and export correct objects for ProbABEL is planned. (thanks to Yurii)

* Small changes (thanks to Lennart).
- in probabel.pl the location for the probabel config file is now set
  to /etc/, the default location it will end up. Before it still
  pointed to the location the probabel.pl file was in. However, a
  better fix would be to somehow put in the actual value of the
  --sysconfdir option to ./configure.
- The PDF of the LaTeX documentation is now only generated if the
  pdflatex binary can be found. So now we also build the documentation
  using autotools.
- On 32bit Linux systems ProbABEL can now also use large (>4GB) input files.


* ProbABEL uses autoconf and automake now (thanks to Lennart).
  After downloading the source from SVN run
autoreconf -i

  to generate the ./configure script and some other files (this is not
  necessary when installing from the distributed .tar.gz file).

  To compile and install the package run
./configure
make
make check
make install

  This will install the binaries (palinear etc.) in /usr/local/bin/,
  the documentation in /usr/local/share/doc/probabel/, the
  probabel_config.cfg.example file in /usr/local/etc/, and the examples in
  /usr/local/share/ProbABEL/examples/. The ./configure script tests
  for the presence of the pdflatex program. If it is not present the
  PDF version of the documentation will not be built.

  For more information see the file doc/INSTALL.

In the ProbABEL autotools integration branch:
- Removed the old Makefile
- Added configure.ac that servers as input for auto(re)conf
- Added the Makefile.am files needed by automake to generate the final Makefiles.


***** v.0.1-9e (2011.05.15)
Previous version wouldn't compile because of missing frerror. file.
Fixes bug #1339 (the bug fix was done a month ago, but no package had
been release since then).

***** v.0.1-9d (2010.10.08)

fix related to compilation with updated 'filevector' library

***** v.0.1-9c (2010.08.01)

Bug identification and fix by Vadim Pinchuk (McGill). Here is his report:

I have found a small bug that caused reporting of incorrect
subject IDs in the error message when I tried to run palinear
analysis. It caused a bit of confusion on my side.
... I fixed it in downloaded copy of the source code and it reports
correct subject IDs. We were able to overcome the problems in the data
after the fix and submit analysis successfully.

Bravo, Vadim! Thumbs up, and many thanks from us (and of course all users)!


***** v.0.1-9b (2010.06.02)

changes in filevector part

***** v.0.1-9 (2010.05.28)

changes in filevector part

cox regression does not work

***** v.0.1-8 (2010.05.10)

changes in filevector part

***** v.0.1-6b (2010.04.22)

error message with missing map-file reported absence of
other file, fixed: thanks to N. Pirastu for noticing!

***** v.0.1-6 (2010.04.02)

bug fix in palogist, coxph (would not run with missing genotypic data present)

Thanks to Ida Surakka, Pau Navarro and Kati Kristiansson for spotting the problem!

***** v.0.1-5 (2010.03.07)

Use of updated FILEVECTOR (r300)

Fixed bug preventing coxph to run

***** v.0.1-4 (2010.02.12)

Instead chi2, log-likelihood is reported

ProbABEL can now work in very low memory mode, using filevector library
(see examples/prepare_data.R for examples how to convert data from
MACH to filevector format).

Can also treat missing values (NA, nan)

***** v.0.1-3 (2009.11.25)

This version is courtesy of Han Chen (Nov 9, 2009). The modifications allow
to extract the covariance between the estimate of beta(SNP) and the estimate
of beta(interaction). This information can be used to, e.g., to perform a
rigorous 2df meta-analysis of interactions later on.

Here is a more detailed description of what the changes concern (by Han Chen):

For model

Y = b_0 + b_cov1 * cov1 + b_cov2 * cov2 + ... + b_covN * covN + b_SNP * SNP + b_covX_SNP * covX * SNP

(1<=X<=N, covX can be any covariate in the phenotype file, from cov1 to covN,
option --interaction=X, see ProbABEL manual for details)
Or model

Y = b_0 + b_cov1 * cov1 + b_cov2 * cov2 + ... + b_covX-1 * covX-1 + b_covX+1 * covX+1 + ... + b_covN * covN + b_SNP * SNP + b_covX_SNP * covX * SNP

(1<=X<=N, covX can be any covariate in the phenotype file, from cov1 to covN,
option --interaction_only=X, see ProbABEL manual for details)
This "plus" version reports naive covariance (default) or robust covariance
(option --robust) between b_SNP and b_covX_SNP estimates, for palinear and
palogist (can NOT report covariance for pacoxph, Cox regression).
Covariances are no reported for options --score, --allcov, --mmscore

LINEAR REGRESSION OR LOGISTIC REGRESSION ONLY

Bravo, Han! -- and many thanks!


***** v.0.1-2 (2009.10.19)

Minor bug fix: allele frequency was estimated wrongly when option --ngpreds=2
was used. The bug had no effect anything else (betas, ses, chi2s were correct).
Many thanks to Han Chen (Boston University) for identification of this bug
and providing us with an excellent summary which allowed fixing the bug easily!

***** v.0.1-1 (2009.09.22)

Added "robust" option, which computes SEs using formula
(X'X)^(-1) (X' V X) (X'X)^(-1) where V is diagonal matrix containing
squares of residuals. In standard analysis, V is diagonal matrix
containing constant = RSS/N

***** v.0.1-0 (2009.08.15)

fixed a bug in phedata class: if ID was starting with 'N' or 'n', it
was counted as NA (thanks to Youfan & Ida Surakka for pointing the
bug out)


***** v.0.0-9 (2009.07.20)

mmscore bug fixed (ses of betas were inflated by sd of the trait's distribution)


***** v.0.0-8 (2009.06.09)

1. New interaction option added. Key --interaction_only=<number> allows you to perform interaction analysis without covariate acting in ineraction.
Examlple: trait ~ cov1 + cov2 + cov3*SNP
In previos versions only models like trait ~ cov1 + cov2 + cov3 + cov3*SNP were possible

2. mmoscore option modified. Now any amount of covariates is possible.


***** v.0.0-7 (2009.03.19)

Bug with interaction in survival analysis fixed.


***** v.0.0-6 (2009.01.28)

Added score test for association between a trait and genetic
polymorphism, in samples of related individuals.
Use key --mmscore <file with inverse of variance-covariance matrix>
File must contain the first column with id names like in phenotype file.
The rest is the matrix itself.
In case of mmscore phenotype file contain only two columns - id names and trait.

See example/mmscore.R and example/mmscore.sh.


***** v.0.0-5d (2009.01.11)

Possibility of analysis for SNP interaction has been added.
New key is --interaction=<covariate number>.
Default is --interaction=0. The first covariate is --interaction=1.

Output organizing has been changed. Now there is one file per one model.
Output file names consist of prefix which goes together with input
parameter --out and one of following postfixes:
"_2df.out.txt", "_add.out.txt", "_domin.out.txt", "_recess.out.txt", "_over_domin.out.txt".
If postfix is not given name "regression" is used.


File bin/probabel_system_example.pl has been changedand renamed
to probabel.pl_example. This script combines all ProbABEL functions and allows
you to organize work with your MACH output files. Just change configuration
file probabel_config.cfg_example and probabel.pl_example like it is written in manual.

Now chi2 is likelihood ration test when null model is without SNP and alternative - with SNP and interaction.


***** v.0.0-5c (2008.12.05)

Changed presentation of output: effective allele (A1) is mentioned
in exact manner


***** v.0.0-5b (2008.11.20)

Output from analysis with ngpred=2 (MLPROB files) fixed (models were
named wrongly)


***** v.0.0-5a (2008.10.05)

--allcov option added (allows output of estimates for all covariates)

example system-wide perl script running the analysis for
(potentially multiple cohorts) and joining single-chomosome
outputs to single file provided

A change in the procedure to read genotypic files, read format of the
beging of every line changed from "%d->%s" to "%[^->]->%[^->]" (thanks to
Bertram Muller-Myhsok & Benno Puetz)


***** v.0.0-5 (2008.06.07)

More than one genomic predictor per point: possibility to
work with MLPROB files (options --ngpreds added).


***** v.0.0-4 (2008.05.20)

Score test implemented (palinear, palogistic)

Logistic regression: eps-loglikelihood change as convergence criterium


***** v.0.0-3

Survival analysis added

SNP Z and Chi-square statistics added to output

Output generated in tab-delimited format

Documentation updated