			M S S S t e s t
			===============


MSSStest is a program implementing the method described in the article
"The Multiple Sclerosis Severity Score" by R. Roxburgh, S. Seaman et al.
(2004), accepted for publication by Neurology.  It calculates
MSSS scores and uses these scores to test for differences between
disease progression rates in different groups defined by genotype at
some locus.  A brief description of the method is given below.

Introduction
============

Suppose one is interested in determining whether the genotype at some
locus of interest affects the rate of progression of disease of people
with Multiple Sclerosis (MS).  One method would be to calculate the
mean EDSS score in patients with each of the genotypes and test whether
they are significantly different.  However, this approach is
inefficient if patients have their EDSS assessed at different durations
of disease, as patients who have had MS longer typically have higher
EDSS scores regardless of their genotype.  By adjusting for duration, a
more powerful test can be developed.

MSSS Method
===========

Given a dataset of EDSS scores measured on patients at different
durations of disease, a table is generated to convert EDSS scores to
MSSS scores.  This is done as follows.  For each i (i=0, ..., 30), all
patients having durations between i-2 years and i+2 years are ranked
according to their EDSS scores.  Suppose there are Ni such patients.
Then, for each possible EDSS value (0, 1, 1.5, ..., 9.5), the average
rank for that value is calculated.  Suppose that for year i and an EDSS
of j, this average rank is Rij.  Rij is divided by Ni + 1 and then
multiplied by 10, to yield a number between 0 and 10.  This is the MSSS
score for a person whose EDSS is measured as j when their duration is i
years.  This procedure is repeated for each value of i in turn.
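
The table construction described above can be sketched in Python.
This is an illustration only, not the code in roxburgh.cpp.  The
function name local_msss_table and its parameters are hypothetical, the
window is assumed to include both endpoints, and EDSS values that do
not occur in a window are left out of the table because the text does
not say how they are handled.

```python
from collections import defaultdict


def local_msss_table(patients, max_duration=30, half_window=2):
    """Build a Local MSSS table from (edss, duration) pairs.

    Returns a dict mapping (duration_year, edss) to an MSSS value.
    Hypothetical helper: names and defaults are assumptions.
    """
    table = {}
    for i in range(max_duration + 1):
        # EDSS scores of all patients whose duration lies in [i-2, i+2]
        # (inclusive endpoints are an assumption).
        window = sorted(
            e for e, d in patients if i - half_window <= d <= i + half_window
        )
        n = len(window)

        # Collect the 1-based rank positions occupied by each EDSS value.
        positions = defaultdict(list)
        for rank, edss in enumerate(window, start=1):
            positions[edss].append(rank)

        # Average rank R_ij, divided by N_i + 1 and multiplied by 10.
        for edss, ranks in positions.items():
            avg_rank = sum(ranks) / len(ranks)
            table[(i, edss)] = 10.0 * avg_rank / (n + 1)
    return table
```

For example, with the four patients (1.0, 5), (2.0, 5), (2.0, 6) and
(6.0, 7), given as (EDSS, duration), the window for year 5 contains all
four.  EDSS 2.0 occupies ranks 2 and 3, so its average rank is 2.5 and
its MSSS is 10 x 2.5 / 5 = 5.0.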

A patient having an MSSS of x progresses faster than 10x% of MS
patients in the population and slower than (100-10x)%.  So, for
example, an MSSS of 5.0 means progressing at the median rate.  A
patient whose MSSS is 9.0 is a fast progressor, progressing faster than
90% of patients.  A patient whose MSSS is 1.0 is a slow progressor,
progressing faster than just 10% of patients.

Using data on 9892 patients from 11 (mainly European) countries, we
derived an MSSS table.  This is the `Global MSSS' table, which can be
used to look up, for each patient in a study, the MSSS score that
corresponds to their EDSS and duration.  However, an alternative is to
generate an MSSS table from any particular set of data being analysed.
This is a `Local MSSS' table.

Once MSSS scores have been assigned to patients (whether using the
Global or a Local table), the Kruskal-Wallis test can be used to
compare median MSSS in the different genotype groups.  The
Kruskal-Wallis test is similar to the ANOVA test, but is a
non-parametric test, i.e. it does not assume normally distributed
data.  If there are only two genotype groups, the Kruskal-Wallis test
is equivalent to the Wilcoxon rank-sum test (also known as the
Mann-Whitney U test).  Patients with duration 0 years are excluded from
the test
because EDSS assessments in the first year are not adequately
predictive of later disease progression.
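
As a rough sketch of the test just described, the Kruskal-Wallis H
statistic can be computed as follows.  The function name
kruskal_wallis_h and the record layout are hypothetical.  The tie
correction shown is the standard textbook one; whether MSSStest applies
it is an assumption.  The asymptotic p-value, which would come from a
chi-square distribution with (number of groups - 1) degrees of freedom,
is not computed here.

```python
def kruskal_wallis_h(records):
    """Return (H, degrees_of_freedom) for (msss, duration, group) records.

    Patients with duration 0 are dropped before ranking, as in the text.
    Hypothetical helper, not the MSSStest implementation.
    """
    kept = sorted(((m, g) for m, d, g in records if d > 0), key=lambda r: r[0])
    n = len(kept)

    # Assign midranks: tied values share the average of their ranks.
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and kept[j][0] == kept[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    # Rank sum and size of each group.
    rank_sums, counts = {}, {}
    for (_, group), rank in zip(kept, ranks):
        rank_sums[group] = rank_sums.get(group, 0.0) + rank
        counts[group] = counts.get(group, 0) + 1

    h = 12.0 / (n * (n + 1)) * sum(
        rank_sums[g] ** 2 / counts[g] for g in counts
    ) - 3 * (n + 1)

    # Standard tie correction (undefined if every value is identical).
    correction = 1.0 - tie_term / float(n ** 3 - n)
    return h / correction, len(counts) - 1
```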

The MSSStest program performs this Kruskal-Wallis test, having first
assigned MSSS scores using either the Global MSSS table or a Local MSSS
table generated from the data provided.  We recommend that in nearly
all cases, the Global table should be used.  A Local table should be
used only if the sample is large (>1000 patients) and the method of
recruitment of patients is such that the distribution of EDSS
conditional on duration is very different from that of the combined
cohort of 9892 patients used to calculate the Global table.  This could
be, for example, if a study recruited only fast and/or slow
progressors.

P-values for the Kruskal-Wallis test can be obtained either by using an
asymptotic (large sample) approximation or by an exact, permutation
method.  The asymptotic method should be fine for large samples.
However, if the sample is small, ask MSSStest to also calculate a
permutation p-value.  In most cases, this should be very similar to
that calculated by the asymptotic method.  If they are very different,
the sample is too small for the asymptotic approximation to be reliable
and the permutation p-value should be used instead.
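
The idea behind a permutation p-value can be sketched as below.  This
is a Monte Carlo approximation that shuffles the group labels at
random; an exact method would enumerate every possible relabelling.
How MSSStest itself computes the permutation p-value is not stated
here, so the function names, the number of shuffles and the seed are
all assumptions.  The tie correction is omitted because it depends only
on the pooled values and so does not change when labels are permuted.

```python
import random


def h_statistic(values, groups):
    """Uncorrected Kruskal-Wallis H using midranks (hypothetical helper)."""
    n = len(values)
    order = sorted(range(n), key=lambda k: values[k])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j < n and values[order[j]] == values[order[i]]:
            j += 1
        for k in range(i, j):
            ranks[order[k]] = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        i = j
    sums, counts = {}, {}
    for rank, group in zip(ranks, groups):
        sums[group] = sums.get(group, 0.0) + rank
        counts[group] = counts.get(group, 0) + 1
    return 12.0 / (n * (n + 1)) * sum(
        sums[g] ** 2 / counts[g] for g in sums
    ) - 3 * (n + 1)


def permutation_p_value(values, groups, n_perm=2000, seed=0):
    """Estimate the p-value by randomly reassigning group labels."""
    rng = random.Random(seed)
    observed = h_statistic(values, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Count relabellings at least as extreme as the observed data.
        if h_statistic(values, shuffled) >= observed - 1e-12:
            hits += 1
    # Adding 1 to both counts keeps the estimate strictly above zero.
    return (hits + 1) / float(n_perm + 1)
```

If this estimate and the asymptotic p-value disagree substantially, the
advice above applies: prefer the permutation result.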

Another application of the MSSS is to describe disease severity in a
single dataset.  If all patients entered into MSSStest belong to a
single group, it will calculate the mean Global MSSS for that group but
will not perform a Kruskal-Wallis test.  The mean is calculated
excluding patients with duration 0 years because EDSS assessments in
the first year are not adequately predictive of later disease
progression.
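
A minimal sketch of this calculation, assuming each patient is held as
an (msss, duration) pair; the function name mean_msss is hypothetical:

```python
def mean_msss(records):
    """Mean MSSS over patients with duration > 0, or None if there are none."""
    kept = [msss for msss, duration in records if duration > 0]
    if not kept:
        # Every patient has duration 0, so there is nothing to average.
        return None
    return sum(kept) / len(kept)
```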

Input files
===========

MSSStest requires as input files:

edss.txt
This should be a tab- or space-delimited ASCII text file.  It
should contain one row for each individual in the data set.  Five
columns are required.  The first two columns are simply labels for the
individuals in the data set.  They are not used to perform the test.
The first column is the family number and the second column is the
number of the individual in the family.  The third column is for the
EDSS score.  The fourth is for the disease duration at the time the
EDSS was measured, rounded down to a whole number of years.  The
fifth is for the group to which the individual belongs.  Groups are
coded numerically and should be between 0 and 99.  An example edss.txt
file is provided.  Note that no header row is allowed.
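
To check a file against this layout before running MSSStest, something
like the following could be used.  It is a sketch, not part of
MSSStest: the function name parse_edss_lines is hypothetical, and
rejecting rows that do not have exactly five columns and skipping blank
lines are assumptions about how strict the format is.

```python
def parse_edss_lines(lines):
    """Parse edss.txt-style rows into tuples.

    Each tuple is (family, individual, edss, duration, group).
    Raises ValueError on a malformed row.
    """
    records = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()  # splits on tabs and spaces alike
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 5:
            raise ValueError("line %d: expected 5 columns, got %d"
                             % (line_no, len(fields)))
        family, individual = fields[0], fields[1]  # labels only
        edss = float(fields[2])
        duration = int(fields[3])  # whole years
        group = int(fields[4])
        if not 0 <= group <= 99:
            raise ValueError("line %d: group must be between 0 and 99"
                             % line_no)
        records.append((family, individual, edss, duration, group))
    return records
```

It could be applied to a file with, for example,
parse_edss_lines(open("edss.txt")).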

global.dat
This is the Global MSSS table: a matrix for converting EDSS scores into
MSSS scores.  It is not necessary to understand it, but if you are
interested, row i corresponds to an EDSS score of i/2 (e.g. the fifth
row is for EDSS=2.5) and column j corresponds to a duration of j-1
years.
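
The indexing rule above amounts to the following arithmetic.  The
function name global_table_position is hypothetical, and rows and
columns are counted from 1 as in the description.  Note that under this
rule an EDSS of 0 maps to row 0; how that case is stored in global.dat
is not stated here.

```python
def global_table_position(edss, duration):
    """Return the (row, column) in global.dat for an EDSS and duration.

    Row i corresponds to EDSS i/2 and column j to duration j-1 years.
    """
    doubled = edss * 2
    row = int(round(doubled))
    if abs(row - doubled) > 1e-9:
        raise ValueError("EDSS must be a multiple of 0.5")
    column = duration + 1
    return row, column
```

For example, an EDSS of 2.5 at duration 0 years gives row 5, column 1.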

Output files
============

MSSStest produces two output files.

msss.out
This contains a summary of the results of the Kruskal-Wallis test or,
in the case of a single group, just the mean MSSS.

indivmsss.out
This is the same as edss.txt, except that an extra, sixth column is
added containing each individual's MSSS score.


Obtaining MSSStest
==================

Versions of MSSStest are available for Windows and Linux - see
the download page:

	http://www-gene.cimr.cam.ac.uk/MSgenetics/GAMES/MSSS/download.html

If you wish to use the program on a platform other than Windows or
Linux, we also provide source code on the download page. MSSStest is
written in C++ and may be compiled using a suitable C++ compiler.  The
files roxburgh.cpp and neededroutines.cpp are C++ source files and
neededroutines.h is a "header" file.

On Unix, the program would normally be compiled with:

	CC -o MSSStest.exe neededroutines.cpp roxburgh.cpp -lm -lc

where CC is the command name of the compiler.  This produces the
executable file MSSStest.exe.  Alternatively, a makefile is also
included (this may require editing for your C++ compiler).

Testing the program
===================

We provide a test file, edss.dat.  This contains 250 patients,
belonging to three genotype groups.  When these data are analysed using
the Global MSSS, the p-value is 0.69.  When they are analysed using
Local MSSS (which we do not recommend for such a small dataset), it is
0.62.

The data files edss.txt and global.dat need to be in the same folder as
the file mssstest.exe. The way the program is launched in Windows is
simply by double clicking on the mssstest.exe icon (once it has been
unzipped).


Licensing of MSSStest
=====================

This software is downloaded at the user's risk.  Anyone may copy it but
no one may sell it.

We make no representation or warranties with respect to the MSSStest
software and specifically disclaim any implied warranties of
merchantability and fitness for a particular purpose.  We reserve the
right to revise the MSSStest software and to make changes therein from
time to time without obligation to notify any person or organisation of
such revision or changes.  While we make every effort to ensure the
reliability of the MSSStest software we may not be held responsible for
errors, omissions or other inaccuracies or any consequences thereof.

THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN
NO EVENT SHALL SHAUN SEAMAN OR RICHARD ROXBURGH OR THEIR EMPLOYERS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Version 2.0 September 2004
==========================

The new features of this version are:

A) The program will calculate the mean MSSS when all patients are
included in one group.  This can then be used as a general descriptor
of progression in that group.

B) The program still accepts data from patients with disease duration
less than one year (i.e. whose disease duration is given as "0") and
will calculate a nominal MSSS for them.  However, these patients are
not included in the calculation of mean MSSS, nor are they used in the
Kruskal-Wallis test.  Note that if every patient in a group has disease
duration "0", the program will crash.

Version 3.0: 1st August 2007
============================

Versions of the program prior to this date had an error which made the
file indivmsss.out unreliable when patients of duration 0 were
entered.  Individuals' MSSS scores as listed in this file were
correct.  However, they were assigned to the wrong group if there were
patients with duration 0 in the input file.  This did not affect the
Kruskal-Wallis analysis or p-values but did mean that calculated mean
scores for each group were unreliable.  MSSStest version 3.0 has this
bug corrected. We are grateful to Yoav Ben-Shlomo (Dept. of Social
Medicine, University of Bristol) for bringing this error to our
attention.

Shaun Seaman and Richard Roxburgh
2 August 2007

Max Planck Institute for Psychiatry
Kraepelinstr. 2-10
Munich
Germany