analysis_id |
integer |
11 |
|
PRIMARY KEY, NOT NULL |
|
name |
varchar |
255 |
|
A way of grouping analyses. This should be a handy short identifier that helps people find the analysis they want, for instance "tRNAscan", "cDNA", "FlyPep", or "SwissProt". It should not be assumed to be unique; for example, there may be many separate analyses done against a cDNA database. |
|
description |
text |
64000 |
|
|
|
program |
varchar |
255 |
|
UNIQUE, NOT NULL, Program name, e.g. blastx, blastp, sim4, genscan. |
|
programversion |
varchar |
255 |
|
UNIQUE, NOT NULL, Version description, e.g. TBLASTX 2.0MP-WashU [09-Nov-2000]. |
|
algorithm |
varchar |
255 |
|
Algorithm name, e.g. blast. |
|
sourcename |
varchar |
255 |
|
UNIQUE, Source name, e.g. cDNA, SwissProt. |
|
sourceversion |
varchar |
255 |
|
|
|
sourceuri |
text |
64000 |
|
This is an optional, permanent URL or URI for the source of the analysis. The idea is that someone could recreate the analysis directly by going to this URI and fetching the source data (e.g. the blast database, or the training model). |
|
timeexecuted |
timestamp |
0 |
current_timestamp |
NOT NULL |
|
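The column list above can be tried out with a minimal sketch in Python using the standard-library sqlite3 module. It assumes that the three columns marked UNIQUE (program, programversion, sourcename) form a single composite key rather than three separate ones; the table above does not state this. The helper names make_db and add_analysis and the version strings are hypothetical, and SQLite types stand in for the listed ones.

```python
import sqlite3

# Assumption: the UNIQUE markers on program, programversion and sourcename
# are read as ONE composite constraint. Column names and defaults follow
# the table above; SQLite ignores the declared lengths.
DDL = """
CREATE TABLE analysis (
    analysis_id    INTEGER PRIMARY KEY,
    name           VARCHAR(255),
    description    TEXT,
    program        VARCHAR(255) NOT NULL,
    programversion VARCHAR(255) NOT NULL,
    algorithm      VARCHAR(255),
    sourcename     VARCHAR(255),
    sourceversion  VARCHAR(255),
    sourceuri      TEXT,
    timeexecuted   TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (program, programversion, sourcename)
);
"""


def make_db():
    """Create an in-memory database holding only the analysis table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


def add_analysis(conn, name, program, programversion, sourcename=None):
    """Insert one analysis row and return its analysis_id."""
    cur = conn.execute(
        "INSERT INTO analysis (name, program, programversion, sourcename) "
        "VALUES (?, ?, ?, ?)",
        (name, program, programversion, sourcename),
    )
    return cur.lastrowid
```

Under this reading, two analyses may share the name "cDNA" as long as they differ in program, programversion or sourcename, while repeating all three raises sqlite3.IntegrityError.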
analysisfeature_id |
integer |
11 |
|
PRIMARY KEY, NOT NULL |
|
feature_id |
integer |
10 |
|
UNIQUE, NOT NULL |
feature.feature_id |
analysis_id |
integer |
10 |
|
UNIQUE, NOT NULL |
analysis.analysis_id |
rawscore |
float |
20 |
|
This is the native score generated by the program; for example, the bitscore generated by blast, or the scores generated by sim4 or genscan. One should not assume that high is necessarily better than low. |
|
normscore |
float |
20 |
|
This is the rawscore, but semi-normalized. Complete normalization that would allow comparison of features generated by different programs would be nice but is too difficult. Instead, the normalization should strive to enforce the following semantics:
* normscores are floating-point numbers >= 0
* high normscores are better than low ones.
For most programs, it is sufficient to make the normscore the same as the rawscore, provided these semantics are satisfied. |
|
significance |
float |
20 |
|
This is some kind of expectation or probability metric, representing the probability that the analysis would appear randomly given the model. As such, any program or person querying this table can assume the following semantics:
* 0 <= significance <= n, where n is a positive number, theoretically unbounded but unlikely to be more than 10
* low numbers are better than high numbers. |
|
identity |
float |
20 |
|
Percent identity between the locations compared. Note that these 4 metrics do not cover the full range of possible scores; it would be undesirable to list every possible score, as this should be kept extensible. Instead, for non-standard scores, use the analysisprop table. |
|
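The score semantics described for normscore, significance and identity can be checked before a row is stored. The following Python sketch is one possible way to do that; the function names check_scores and default_normscore are hypothetical, and the 0 to 100 range for identity is an assumption based on the word "percent", not something the table states.

```python
def check_scores(normscore=None, significance=None, identity=None):
    """Return a list of violations of the documented score semantics.

    A value of None means the column is left empty and is not checked.
    """
    problems = []
    # Documented: normscores are floating-point numbers >= 0.
    if normscore is not None and not normscore >= 0:
        problems.append("normscore must be >= 0")
    # Documented: 0 <= significance <= n, with n theoretically unbounded,
    # so only the lower bound is enforced here.
    if significance is not None and not significance >= 0:
        problems.append("significance must be >= 0")
    # Assumption: "percent identity" is taken to mean a value in [0, 100].
    if identity is not None and not 0 <= identity <= 100:
        problems.append("identity must be between 0 and 100")
    return problems


def default_normscore(rawscore):
    """Reuse the rawscore as normscore when it is already >= 0.

    Returns None otherwise; the caller then needs a program-specific
    normalization. This helper does not check that higher is better,
    which depends on the program that produced the score.
    """
    return rawscore if rawscore >= 0 else None
```

For example, check_scores(normscore=-1.0) reports a single violation, while a row with a non-negative normscore, a small non-negative significance and an identity of 98.5 passes without any.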