File: version-3-2.text



	AUTOCLASS C VERSION 3.2 NOTES 
====================================================================== 
======================================================================

Documentation:
------------------------------

  1. autoclass-c/doc/search-c.text -
        Added a new section: 14.0 How to get AutoClass C to Produce 
     Repeatable Results.

     Added information about running AutoClass C with more than 1000
     attributes to section: 10.0 Do I Have Enough Memory and Disk Space?

     Changed the behavior of search parameter force_new_search_p in 
     order to prevent search trials from being inadvertently lost:
     if TRUE, will ignore any previous search results, discarding the 
     existing .search & .results[-bin] files after confirmation by the 
     user; if FALSE, will continue the search using the existing 
     .search & .results[-bin] files.  The default value of
     force_new_search_p is now true.

  2. autoclass-c/doc/interpretation-c.text -
        Added section headings and a new section entitled: Comparing 
     Influence Report Class Weights And Class/Case Report Assignments

  3. autoclass-c/doc/preparation-c.text -
        Added more to section: 1.2.1  SINGLE_NORMAL_CN/CM and 
     MULTI_NORMAL_CN Models

  4. autoclass-c/doc/reports-c.text -
        Improved the last paragraph of Generating Sigma Contour Values.

        Replaced parameters start_sigma_contours_att and
     stop_sigma_contours_att with sigma_contours_att_list, to allow
     non-contiguous groups of attributes to be specified.


Programming:
------------------------------

  1. autoclass-c/prog/globals.c -
        Update "G_ac_version" to 3.2.

  2. autoclass-c/prog/intf-reports.c -
        In INFLUENCE_VALUES_HEADER, change
     `fprintf( influence_report_fp, header);' to
     `fprintf( influence_report_fp, header, "");', and in
     CLASS_WEIGHTS_AND_STRENGTHS and CLASS_DIVERGENCES add args to 
     output_title fprintf for new page -- this prevents 
     segmentation faults, when the number of attributes exceeds one
     page, while in report_mode = "text".
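
     The failure mode behind this fix can be sketched as follows.  This is
     a minimal illustration, not the code in intf-reports.c: the header
     text and the function name format_header are hypothetical, and
     snprintf is used in place of fprintf only so that the result can be
     inspected.  The assumption is that the header string contains one %s
     conversion, so that calling the formatting function without a
     matching argument is undefined behavior and can crash.

```c
#include <stdio.h>

/* Hypothetical header template with a single %s conversion.
   The real header string in intf-reports.c is not shown in these notes. */
static const char *const header = "%sINFLUENCE VALUES\n";

/* Format the header into buf, supplying one argument for the one %s.
   Returns the number of characters written (excluding the final '\0'). */
int format_header(char *buf, size_t size, const char *page_prefix)
{
    /* Undefined behavior:  snprintf(buf, size, header);              */
    /* Well defined: every conversion has a matching argument.        */
    return snprintf(buf, size, header, page_prefix);
}
```

     Passing "" as the extra argument, as in the corrected call, leaves
     the printed text unchanged while satisfying the conversion.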

  3. autoclass-c/prog/intf-sigma-contours.c -
        In COMPUTE_SIGMA_CONTOUR_FOR_2_ATTS, corrected initialization 
     of *rotation. This corrects erroneous values of the contour's
     rotation.

  4. autoclass-c/prog/struct-class.c -
        Correct compiler warning "struct-class.c:239: warning: 
     unused variable `database'".
  
  5. autoclass-c/prog/struct-data.c, globals.h, globals.c, search-control.c -
        In EXPAND_DATABASE, use comp_database->n_data rather than 
     G_s_params_n_data, since G_s_params_n_data does not do the right thing
     when expand_database is called during report generation (it reads
     the whole file, not just n_data cases).  Remove references to
     G_s_params_n_data from the 2nd to 4th files.

  6. autoclass-c/prog/intf-reports.c -
        In XREF_GET_DATA, allocate more storage for instance class 
     probabilities if there are more than MAX_NUM_XREF_CLASS_PROBS, and 
     only save for printing a maximum of MAX_NUM_XREF_CLASS_PROBS classes.

     IMPORTANT NOTE: This bug fix means that, in any report previously
     generated by AutoClass C, any database instance that has five class
     probability entries in the class cross-reference report, and for
     which 1.0 minus the sum of the five probabilities is greater than
     the largest of them, is in the WRONG CLASS!  Re-run the reports
     with this version!
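
     The condition in the note can be checked mechanically.  The sketch
     below is only an illustration of that arithmetic: the function name
     and the constant LISTED_PROBS are hypothetical and do not appear in
     AutoClass C.

```c
#include <stddef.h>

#define LISTED_PROBS 5  /* hypothetical name: entries listed per instance */

/* Returns 1 if the probability mass missing from the five listed entries
   (1.0 minus their sum) exceeds the largest listed entry, i.e. the
   condition under which the note says an old report is wrong. */
int xref_entry_is_suspect(const double probs[LISTED_PROBS])
{
    double sum = 0.0;
    double largest = 0.0;

    for (size_t i = 0; i < LISTED_PROBS; i++) {
        sum += probs[i];
        if (probs[i] > largest)
            largest = probs[i];
    }
    return (1.0 - sum) > largest;
}
```

     For example, listed probabilities of 0.15, 0.12, 0.10, 0.08 and 0.05
     sum to 0.50; the remaining 0.50 exceeds 0.15, so the entry is flagged.
     Probabilities of 0.40, 0.20, 0.15, 0.10 and 0.10 leave only 0.05,
     which is less than 0.40, so the entry is not flagged.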

  7. autoclass-c/prog/autoclass.c -
        Print the AutoClass C version when the user invokes AutoClass
     with no arguments: % autoclass

  8. autoclass-c/load-ac -
        Specified define flags for SunOS gcc and Solaris gcc compilations
     to prevent compiler warnings. Added IRIX 6.4 compatibility.

  9. autoclass-c/prog/autoclass.h -
        For gcc under SunOS, include function prototypes for *rand48 functions,
     to prevent compiler warnings.

 10. autoclass-c/prog/intf-reports.c -
        Add descriptive text for each influence value class parameter for
     reports with parameter report_mode = "text".

 11. autoclass-c/prog/autoclass.make.solaris.cc -
        Corrected optimization flag.

 12. autoclass-c/prog/intf-reports.c -
        In FORMAT_REAL_ATTRIBUTE, correct correlation matrices print-out 
     for non-contiguous model term attributes, and print matrices only once,
     after all class attributes are listed.

 13. autoclass-c/prog/search-control.c -
        In AUTOCLASS_SEARCH, if force_new_search_p is false, exit if there 
     is no <...>.results[-bin] file.  Make TRUE the default for 
     force_new_search_p.

 14. autoclass-c/prog/intf-reports.c -
        In PRINT_ATTRIBUTE_HEADER, remove references to INTEGER attribute type.

 15. autoclass-c/prog/getparams.c -
        In GETPARAMS, correct logic so that missing "line feed" on last line
     of the file will be read properly, rather than getting:
     ERROR: line read exceeds 100 characters: <.....>.
        In GETPARAMS, correct logic so that an empty integer list (e.g. 
     start_j_list =) may be entered in the .s-params file.  This is needed
     for a restart search situation when it is necessary to peel off as many
     classes from the start_j_list as were already done by the previous run.
     If all of the start_j_list was done already, then an empty list is
     required.
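
     One way to accept a final line that lacks a line feed is sketched
     below.  This is not the GETPARAMS code; the function name
     read_param_line and its return convention are assumptions made for
     the illustration.

```c
#include <stdio.h>
#include <string.h>

/* Read one line into buf, stripping the trailing '\n' if present.
   Returns  1 if a line was read (including an unterminated last line),
            0 at end of file,
           -1 if the line does not fit in buf. */
int read_param_line(FILE *fp, char *buf, size_t cap)
{
    size_t len;
    int c;

    if (fgets(buf, (int)cap, fp) == NULL)
        return 0;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';        /* ordinary newline-terminated line */
        return 1;
    }

    /* No newline was stored: either this is the last line of the file,
       or the buffer filled up before the end of the line. */
    c = getc(fp);
    if (c == EOF)
        return 1;                   /* unterminated last line: accept it */
    ungetc(c, fp);
    return -1;                      /* more text follows: line too long */
}
```

     With this approach a file whose last line is "start_j_list =" with
     no trailing line feed yields that line normally, and the next call
     reports end of file.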

 16. autoclass-c/prog/io-read-data.c, io-results.c, io-results-bin.c -
        In READ_DATA, EXPAND_CLSF_WTS, and LOAD_CLASS_DS_S add checks for
     "out of memory" returns from malloc and realloc.

 17. autoclass-c/prog/io-results.c -
        In MAKE_AND_VALIDATE_PATHNAME, VALIDATE_RESULTS_PATHNAME,
     VALIDATE_DATA_PATHNAME, and GET_CLSF_SEQ change strchr to strrchr 
     to handle `../filename.extension'.
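
     The difference between the two calls can be sketched as follows.
     The helper name find_extension is hypothetical, and the check
     against the last '/' is an extra guard added for the illustration;
     it is not described in these notes.

```c
#include <string.h>

/* Return a pointer to the extension (the text after the last '.' in the
   final path component), or NULL if there is none.
   strchr(path, '.') would stop at the first '.', which for
   "../filename.extension" is the leading ".." rather than the extension. */
const char *find_extension(const char *path)
{
    const char *slash = strrchr(path, '/');
    const char *dot = strrchr(path, '.');

    /* Ignore a '.' that belongs to a directory component such as "..". */
    if (dot == NULL || (slash != NULL && dot < slash))
        return NULL;
    return dot + 1;
}
```

     For "../filename.extension" this returns "extension", whereas a
     search with strchr would return a pointer to the start of the path.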

 18. autoclass-c/prog/autoclass.h, predictions.c, search-basic.c, &
     search-control.c -
        Notify the user with a warning message and an option to exit from
     an initial classification run, if the data set size is greater than
     1000.  The message is "WARNING: the default start_j_list may not
     find the correct number of classes in your data set!".

 19. autoclass-c/prog/autoclass.h, autoclass.c, & intf-reports.c -
        Write -reports option screen output to log file.

 20. autoclass-c/prog/io-read-data.c -
        In FIND_DISCRETE_STATS, when the number of discrete value 
     translators is less than attribute definition range, reduce the
     range and output an advisory, rather than outputting warning
     message and asking the user whether to proceed or not.

     The above change was REMOVED, since it caused an incompatibility with
     previous results files: "ERROR: expand_database found unmatched common 
     attributes defs in <.results[-bin] file> and ........

 21. autoclass-c/prog/globals.h, globals.c, search-control-2.c, & search-control.c -
        Warn user of search trials which do not converge, which means that
     their number of try cycles reached the value of the "max_cycles" search 
     parameter.  Do this by printing a warning message after the trial completes.
     Also after the "SUMMARY OF n BEST RESULTS" at the conclusion of each
     run, print "SUMMARY OF TRY CONVERGENCE" for the n best results.
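
     The convergence test described here amounts to comparing each
     trial's cycle count with max_cycles.  A minimal sketch, in which the
     struct, the function names, and the wording of the warning are all
     hypothetical rather than taken from AutoClass C:

```c
#include <stdio.h>

/* Hypothetical per-trial record; the real AutoClass C structures differ. */
struct trial {
    int n_cycles;   /* cycles actually run by this trial */
};

/* Per the note above: a trial that reached max_cycles did not converge. */
int trial_converged(const struct trial *t, int max_cycles)
{
    return t->n_cycles < max_cycles;
}

/* Warn about each non-converged trial and return how many there were. */
int count_unconverged(const struct trial *trials, int n, int max_cycles)
{
    int count = 0;

    for (int i = 0; i < n; i++) {
        if (!trial_converged(&trials[i], max_cycles)) {
            fprintf(stderr,
                    "WARNING: trial %d did not converge within %d cycles\n",
                    i + 1, max_cycles);
            count++;
        }
    }
    return count;
}
```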

 22. autoclass-c/prog/model-multi-normal-cn.c -
        It was recently brought to our attention that the multi-normal
     model, with more than about 10 attributes and several thousand
     instances, would consistently run to the max_duration or
     max_n_tries limit, regardless of how large those limits were.
     Suitably instrumented experiments showed that EM (expectation
     maximization) was actually oscillating.  The problem was traced 
     to a conceptual error in the underflow limiting code that 
     constrains the estimation of empirical standard deviations.  
     This has been corrected.  However, users should be alert for,
     and report, any further problems of this nature.
 
 23. autoclass-c/prog/autoclass.h, intf-reports.c -
        For MNcn attributes, do not sort them within their model term
     when order_attributes_by_influence_p = false.  The outputting of
     MNcn correlation matrices after last class attribute, instead of 
     after each term, is now done by a call to
     GENERATE_MNCN_CORRELATION_MATRICES from
     AUTOCLASS_CLASS_INFLUENCE_VALUES_REPORT.

 24. autoclass-c/prog/intf-reports.c, intf-sigma-contours.c -
        Replace report parameters start_sigma_contours_att and 
     stop_sigma_contours_att with sigma_contours_att_list, to allow 
     non-contiguous groups of attributes to be specified.

        Check for attribute indices of reports parameter
     sigma_contours_att_list which are declared "ignore" by the .model file.
     Prevents segmentation fault.

        Correct erroneous rotations for non-covariant pairs of attributes
     modeled in two different covariant normal terms (the rotations in these
     cases should be 0.0).

 25. autoclass-c/prog/intf-reports.c -
        Previously when specifying report_type = "xref_case" or
     report_type = "xref_class" along with n_clsfs > 1 or clsf_n_list with
     more than 1 list element, the .case-text-n or .class-text-n data would 
     be identical.  Sometimes segmentation faults would occur.  This has
     been corrected.  This was not a problem for report_type = "all" 
     (the default).  Also when using the default for report_type ("all"),
     previously the memory allocated for each classification's cross
     reference was not deallocated after each classification was processed.
     It is now properly deallocated.



======================================================================