<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<book>
<bookinfo>
<title>Tablix modules HOW-TO, part 2</title>
<author>
<firstname>Tomaž</firstname>
<surname>Šolc</surname>
<authorblurb>
<para><email>tomaz.solc@tablix.org</email></para>
</authorblurb>
</author>
<pubdate>
$Id: modules2.db,v 1.4 2006-08-29 14:32:23 avian Exp $
</pubdate>
<abstract>
<para>
Export modules are pieces of code that are dynamically linked with the <filename>tablix2_output</filename> utility at run time and provide functions for exporting data from the internal kernel structures to various file formats. This document describes in detail how to write and build new export modules. It also briefly explains how to use the kernel API from this kind of module.
</para>
</abstract>
<legalnotice>
<para>
Copyright (C) 2005 by Tomaž Šolc.
</para>
</legalnotice>
<mediaobject>
<imageobject><imagedata align="center" fileref="images/lines3.pdf" format="EPS"/></imageobject>
<imageobject><imagedata align="center" fileref="images/lines3.png" format="PNG"/></imageobject>
</mediaobject>
</bookinfo>
<toc></toc>
<chapter>
<title>Introduction</title>
<para>
Since Tablix version 0.0.6, each file format supported by the <filename>tablix_output</filename> utility has been handled by a separate export module. The export module interface changed considerably with the kernel rewrite during the 0.2.x branch.
</para>
<para>
There are two distinct types of modules. <phrase>Fitness modules</phrase> contain <phrase>partial fitness functions</phrase> and provide handlers for various <phrase>restrictions</phrase>. They are loaded by the kernel as specified in the XML configuration file. <phrase>Export modules</phrase>, on the other hand, are loaded by the <filename>tablix2_output</filename> utility and contain functions that translate data from the internal kernel structures to a file in a certain format. Examples are the HTML export module and the comma separated values (CSV) export module. </para>
<para>
Fitness and export modules access the kernel data structures in more or less the same way. A description of this common interface can be found in the first part of this HOW-TO and will not be repeated here. This document describes only the parts of the interface that are specific to export modules.
</para>
<para>
I recommend reading the Tablix User's Manual and the first part of this HOW-TO before attempting to write your own module. While reading this text, you should also keep a browser window nearby with the Tablix kernel API reference manual loaded. Some important functions are described in the text, but in most cases only a pointer to the reference manual is given.
</para>
<!--
<note>
<para>
All example XML configuration files and module source code can be found in the <filename>examples/modules/</filename> subdirectory in the Tablix source tree.
</para>
</note>
-->
</chapter>
<chapter>
<title>
Kernel API
</title>
<sect1>
<title>
Export function
</title>
<para>
The export module interface is very simple compared to that of fitness modules. Each export module must contain only one function, called <function>export_function()</function>. Its single purpose is to convert the data stored in the kernel data structures into a stream of characters and store it in one or more files.
</para>
<para>
Each time the user runs <filename>tablix2_output</filename> with the proper command line arguments, the utility loads the requested export module, parses the XML file, initializes the kernel data structures and then calls <function>export_function()</function>.
</para>
<para>
The prototype for <function>export_function()</function> can be found in <filename>output.h</filename>:
</para>
<programlisting>
typedef int (*<function>export_f</function>)(table *<parameter>tab</parameter>, moduleoption *<parameter>opt</parameter>, char *<parameter>filename</parameter>);
</programlisting>
<para>
<parameter>tab</parameter> is a pointer to the <structname>table</structname> structure. This structure describes the timetable that should be exported - it contains pointers to the chromosome structures.
</para>
<para>
<parameter>opt</parameter> is a pointer to the linked list of module options. Module options are passed to the <filename>tablix2_output</filename> utility with the <parameter>-s</parameter> argument in the following form:
</para>
<screen>
-s option1=value,option2=value,...
</screen>
<para>
You can access this linked list in the same way as in fitness modules using functions <function>option_int()</function>, <function>option_str()</function> and <function>option_find()</function>.
</para>
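<para>
To make the option syntax concrete, the following stand-alone sketch splits a string in the <parameter>-s</parameter> form into name/value pairs. It is only an illustration of the format: <structname>kv_option</structname> and <function>parse_option_spec()</function> are hypothetical names invented for this example and are not part of the Tablix kernel API. A real export module receives an already parsed <structname>moduleoption</structname> list and should use the kernel functions named above.
</para>
<programlisting><![CDATA[
#include <string.h>

/* Hypothetical name/value pair used only in this sketch.
 * Tablix itself passes options as a moduleoption linked list. */
struct kv_option {
    char *name;
    char *value;
};

/* Splits spec in place at ',' and '=' and stores up to max pairs in out.
 * An item without '=' gets an empty value; empty items are skipped.
 * Returns the number of pairs stored. */
int parse_option_spec(char *spec, struct kv_option *out, int max)
{
    int n = 0;
    char *p = spec;

    while (p != NULL && n < max) {
        char *next = strchr(p, ',');
        if (next != NULL) *next++ = '\0';   /* terminate this item */

        if (*p != '\0') {
            char *eq = strchr(p, '=');
            out[n].name = p;
            if (eq != NULL) {
                *eq = '\0';                 /* split name from value */
                out[n].value = eq + 1;
            } else {
                out[n].value = p + strlen(p);   /* points at the empty string */
            }
            n++;
        }
        p = next;
    }
    return n;
}
]]></programlisting>
<para>
Because the string is split in place, the caller must pass a writable buffer rather than a string literal.
</para>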
<para>
The final argument <parameter>filename</parameter> is a string holding the name of the file to be written. The <filename>tablix2_output</filename> utility simply passes this file name from its command line to the export function. The utility itself does not care if the export function actually writes anything to this file. For example, if the export function needs to write more than one file, it can use this argument as the name of the directory to put the files in. <parameter>filename</parameter> could also contain the location in a database where the timetable data should be inserted.
</para>
<note>
<para>
<parameter>filename</parameter> can be equal to <parameter>NULL</parameter>. The export function should treat that condition as a request to write to the standard output. If this is not possible (for example, because more than one file needs to be written), the export function should return an error.
</para>
</note>
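<para>
One way to handle the <parameter>NULL</parameter> case is to hide it behind a pair of small helpers, so that the rest of the export function only deals with a <type>FILE</type> pointer. The sketch below uses only the standard C library. <function>open_output()</function> and <function>close_output()</function> are hypothetical helper names, not functions provided by the Tablix kernel.
</para>
<programlisting><![CDATA[
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns stdout when filename is NULL, otherwise
 * opens the named file for writing. Returns NULL if the file cannot be
 * opened, in which case the export function should return -1. */
FILE *open_output(const char *filename)
{
    if (filename == NULL) return stdout;
    return fopen(filename, "w");
}

/* Hypothetical counterpart: flushes stdout or closes a regular file.
 * Returns 0 on success and EOF on error. */
int close_output(FILE *out)
{
    assert(out != NULL);
    if (out == stdout) return fflush(stdout);
    return fclose(out);
}
]]></programlisting>
<para>
Checking the return value of <function>close_output()</function> lets the module detect write errors, such as a full disk, that only show up when the buffered data is flushed.
</para>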
<para>
The <function>export_function()</function> should return 0 on success and -1 on error. Functions <function>error()</function>, <function>info()</function> and <function>debug()</function> can be used to report various warnings and errors back to the user.
</para>
</sect1>
<sect1>
<title>
Compiling your module
</title>
<para>
As with fitness modules, there are two ways of compiling your export modules. You can use your own <filename>Makefile</filename>, or you can modify the <filename>Makefile.am</filename> supplied in the distribution and compile your module in exactly the same way as the official modules. See the first part of this HOW-TO for details. If you would like to use the second method, please note that the source for the export modules is in the <filename>export/</filename> subdirectory of the Tablix source tree.
</para>
<para>
Export modules are usually stored in the same directory as the fitness modules. While fitness modules can have arbitrary names, all export modules must have the prefix "export_" in their names. The <filename>tablix2_output</filename> utility gets the name of the export module to load by concatenating "export_" and the name of the export format. For example, the following command line:
</para>
<screen>
<prompt>$</prompt> tablix2_output csv result0.xml
</screen>
<para>
will cause the <filename>tablix2_output</filename> utility to load the <filename>export_csv.so</filename> export module.
</para>
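<para>
The naming rule can be sketched in a few lines of C. This is only an illustration of the rule described above, not the code that <filename>tablix2_output</filename> actually uses; <function>export_module_name()</function> is a hypothetical name, and the ".so" suffix is taken from the example above.
</para>
<programlisting><![CDATA[
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "export_<format>.so" into buf.
 * Returns 0 on success and -1 if the result does not fit into size bytes. */
int export_module_name(char *buf, size_t size, const char *format)
{
    int n = snprintf(buf, size, "export_%s.so", format);

    /* snprintf() returns the length it wanted to write; a value of size
     * or more means the output was truncated. */
    if (n < 0 || (size_t)n >= size) return -1;
    return 0;
}
]]></programlisting>
<para>
With the format name "csv" this produces "export_csv.so", which is why a module built under any other file name will not be found by the utility.
</para>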
</sect1>
<sect1>
<title>
Output extensions
</title>
<sect2>
<title>
Description
</title>
<para>
As you have probably noticed, the export function can only access the timetable in its basic chromosome form (fitness functions can also access the slist and extension forms). The first part of this HOW-TO explains that the chromosome extension form resembles a human readable timetable format. Because of this, it is often useful to work with this form in the export function instead of the chromosome form.
</para>
<para>
However, the chromosome extension that is used in fitness functions usually isn't suitable for use in export modules because events (tuples) can be lost in the conversion from the chromosome form. Export modules therefore use another form called <phrase>output extension</phrase> that does not have this drawback.
</para>
<para>
The output extension <structname>outputext</structname> structure closely resembles the normal chromosome extension <structname>ext</structname> structure. It is also defined by one constant and one variable resource type and also consists of a two-dimensional array. However, instead of the single tuple IDs of the normal extension, the two-dimensional array now holds lists of tuples (stored in <structname>tuplelist</structname> structures).
</para>
<para>
Consider the following example: if you construct an output extension <parameter>outputext</parameter> for a constant resource type with type ID <parameter>con_typeid</parameter> and a variable resource type with type ID <parameter>var_typeid</parameter>, then the following element in the two-dimensional array:
</para>
<programlisting>
outputext.list[<parameter>c</parameter>][<parameter>v</parameter>]
</programlisting>
<para>
is a pointer to the <structname>tuplelist</structname> structure which holds a list of tuple IDs of tuples (events) that are using both the constant resource with resource ID <parameter>c</parameter> and type ID <parameter>con_typeid</parameter> and the variable resource with resource ID <parameter>v</parameter> and type ID <parameter>var_typeid</parameter>.
</para>
<note>
<para>
If you don't understand this example, see the section on chromosome extensions in the first part of the HOW-TO (especially the part about visualization). Keep in mind that in most cases the variable resource type used in extensions is time.
</para>
</note>
<para>
The <structname>tuplelist</structname> structure holds a simple array of tuples. The field <structfield>tupleid</structfield> is an array of <structfield>tuplenum</structfield> tuple IDs.
</para>
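<para>
The following stand-alone sketch shows how such a structure can be walked. Because it does not include the Tablix headers, it declares its own stand-in types: <structname>demo_tuplelist</structname> mirrors the two fields described above, while <structname>demo_outputext</structname> and its dimension fields <structfield>connum</structfield> and <structfield>varnum</structfield> are assumptions made for this example. Check the kernel API reference manual for the real structure definitions.
</para>
<programlisting><![CDATA[
#include <stddef.h>

/* Stand-in for tuplelist: an array of tuplenum tuple IDs. */
typedef struct {
    int *tupleid;
    int tuplenum;
} demo_tuplelist;

/* Stand-in for the output extension. The field names connum and varnum
 * (number of constant and variable resources) are assumed. */
typedef struct {
    int connum;
    int varnum;
    demo_tuplelist ***list;     /* list[c][v] points to a demo_tuplelist */
} demo_outputext;

/* Counts all events that use constant resource c, in any variable
 * resource (for example, in any time slot). */
int count_events(const demo_outputext *ext, int c)
{
    int v, total = 0;

    for (v = 0; v < ext->varnum; v++) {
        const demo_tuplelist *cell = ext->list[c][v];
        if (cell != NULL) total += cell->tuplenum;
    }
    return total;
}
]]></programlisting>
<para>
A cell whose <structfield>tuplenum</structfield> is greater than one holds several events at once. This is exactly the information that the normal chromosome extension cannot represent.
</para>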
</sect2>
<sect2>
<title>
Associated functions
</title>
<para>
Three functions are available for converting the chromosome form of the timetable to an output extension: <function>outputext_new()</function>, <function>outputext_update()</function> and <function>outputext_free()</function>. The following example demonstrates their use:
</para>
<programlisting>
int export_function(table *tab, moduleoption *opt, char *file)
{
        outputext *ext;

        ext=outputext_new("dummy-constant-type", "dummy-variable-type");
        outputext_update(ext, tab);
        ...
        outputext_free(ext);
        return(0);
}
</programlisting>
<para>
The <function>outputext_new()</function> function allocates a new output extension structure and initializes its values. The first argument is the name of the constant resource type and the second argument is the name of the variable resource type. If memory allocation fails or the requested resource types are not found, this function returns NULL. The example above doesn't do any error checking. A proper export module should report this condition as an error and abort the execution.
</para>
<para>
The <function>outputext_update()</function> function fills the two-dimensional array in the output extension with values from the timetable structure. After this function has been called, the output extension is ready to use.
</para>
<para>
The <function>outputext_free()</function> function frees all memory allocated for the output extension. It should be called when the extension is no longer needed to prevent memory leaks.
</para>
</sect2>
</sect1>
<sect1>
<title>
Internationalization
</title>
<para>
All character strings in the kernel structures are always UTF-8 encoded, even if the input XML file was in some other encoding. If your export format requires some other character encoding, you can use the libiconv library to transcode strings into other encodings.
</para>
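<para>
As a rough sketch of such a conversion, the helper below wraps the POSIX <function>iconv_open()</function>, <function>iconv()</function> and <function>iconv_close()</function> calls. <function>transcode_from_utf8()</function> is a hypothetical name, not a Tablix function. The sketch assumes a target encoding that uses a single zero byte as the string terminator (for example the ISO-8859 family), and on some systems the program must be linked with <parameter>-liconv</parameter>.
</para>
<programlisting><![CDATA[
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts a NUL-terminated UTF-8 string to the
 * encoding named by "to". Returns a malloc()ed NUL-terminated string
 * that the caller must free(), or NULL on any failure. */
char *transcode_from_utf8(const char *to, const char *utf8)
{
    iconv_t cd = iconv_open(to, "UTF-8");
    if (cd == (iconv_t)-1) return NULL;     /* unknown encoding */

    size_t inleft = strlen(utf8);
    size_t outsize = inleft * 4 + 1;        /* generous bound plus NUL */
    char *out = malloc(outsize);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *inp = (char *)utf8;   /* iconv() wants char **, input is only read */
    char *outp = out;
    size_t outleft = outsize - 1;

    /* The second call with a NULL input flushes any pending shift state. */
    if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1 ||
        iconv(cd, NULL, NULL, &outp, &outleft) == (size_t)-1) {
        free(out);
        iconv_close(cd);
        return NULL;
    }

    *outp = '\0';
    iconv_close(cd);
    return out;
}
]]></programlisting>
<para>
An export module could call this helper on each name just before writing it, for example with "ISO-8859-2" as the target encoding, and report an error if it returns NULL.
</para>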
<para>
If your exported format includes any messages that can be translated into other languages, please enclose them in a gettext translation macro like this:
</para>
<programlisting>
fprintf(file, _("Hello world!"));
</programlisting>
<para>
This way, messages in your export module will be included in the Tablix translations. See the <ulink url="http://www.gnu.org/software/gettext/manual/gettext.html">GNU gettext documentation</ulink> for more information.
</para>
</sect1>
</chapter>
<chapter>
<title>
Example export modules
</title>
<sect1>
<title>
Comma separated values
</title>
<para>
The following is the source code of the <filename>export_csv.so</filename> export module. This is a simple export module that exports timetable data into the "comma separated values" (CSV) format, which is suitable for import into other programs (for example spreadsheets). The CSV format requires each line to be separated into multiple fields by the "," separator character. Fields containing strings must also be enclosed in double quotes. Fields with numbers are written without quotes.
</para>
<para>
The output consists of four header lines, a column header line and a table of events. The first three lines contain the title, the address of the institution and the author. The fourth line contains the fitness of the exported timetable. It is followed by the column header line for the table of events. Each event is represented as a single line. The first field contains the name of the event and subsequent fields contain the names of the resources used by this event, sorted by resource type.
</para>
<programlisting>
#include "export.h"

int export_function(table *tab, moduleoption *opt, char *file)
{
        int typeid,tupleid;
        FILE *out;
        char *name;
        int resid;

        assert(tab!=NULL);

        /* A NULL file name means "write to the standard output". */
        if(file==NULL) {
                out=stdout;
        } else {
                out=fopen(file, "w");
                if(out==NULL) fatal(strerror(errno));
        }

        /* Header lines: general information and fitness. */
        fprintf(out, "\"Title\",\"%s\"\n", dat_info.title);
        fprintf(out, "\"Address\",\"%s\"\n", dat_info.address);
        fprintf(out, "\"Author\",\"%s\"\n", dat_info.author);
        fprintf(out, "\"Fitness\",%d\n", tab->fitness);

        /* Column headers: event name, then one column per resource type. */
        fprintf(out, "\"Event name\"");
        for(typeid=0;typeid<dat_typenum;typeid++) {
                fprintf(out, ",\"%s\"", dat_restype[typeid].type);
        }
        fprintf(out, "\n");

        assert(dat_typenum==tab->typenum);

        /* One line per event (tuple). */
        for(tupleid=0;tupleid<dat_tuplenum;tupleid++) {
                fprintf(out, "\"%s\"", dat_tuplemap[tupleid].name);
                for(typeid=0;typeid<dat_typenum;typeid++) {
                        assert(dat_tuplenum==tab->chr[typeid].gennum);
                        /* Resource of this type assigned to the event. */
                        resid=tab->chr[typeid].gen[tupleid];
                        name=dat_restype[typeid].res[resid].name;
                        fprintf(out, ",\"%s\"", name);
                }
                fprintf(out, "\n");
        }

        if(out!=stdout) fclose(out);
        return 0;
}
</programlisting>
<para>
As you can see, this export module consists of only the export function. A more complicated export module could contain additional functions called from the export function. The <filename>export.h</filename> header contains all prototypes and type definitions required in export modules.
</para>
<para>
The export function first opens the output file. As mentioned in the description of the export function, if the file name argument (called <parameter>file</parameter> in this module) is NULL, then we write to the standard output instead of a file.
</para>
<para>
Next come four calls to the <function>fprintf()</function> function that write the header lines. The title of the exported timetable, the address and the information about the author can be found in the <varname>dat_info</varname> global variable (<structname>miscinfo</structname> structure). The fitness of the timetable can be found in the <structfield>fitness</structfield> field of the <structname>table</structname> structure.
</para>
<para>
Next we write the header of the event table. The first column holds the name of the event, the second column holds the resource of the first resource type, the third column the resource of the second resource type and so on. The first "for" loop therefore iterates through all resource types and prints out their names.
</para>
<para>
Then come two nested "for" loops that print out the whole event table. The outer loop iterates through all events (the current tuple ID is in the <varname>tupleid</varname> variable). We print the name of the event at the beginning of the line and enter the inner loop. This loop again iterates through all resource types and, for each resource type, gets the resource ID (<varname>resid</varname> variable) of the resource that the current event is using. We then look up the name of this resource in the <varname>dat_restype</varname> global variable, which holds pointers to all <structname>resource</structname> structures, and print it.
</para>
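<para>
Note that the module above writes names between double quotes without any escaping, so a name that itself contains a double quote would produce a malformed line. A common CSV convention (described in RFC 4180) is to double every embedded quote. The helper below is a minimal sketch of that convention using only the standard C library. <function>csv_write_string()</function> is a hypothetical name and is not part of the module shown above.
</para>
<programlisting><![CDATA[
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes s as a quoted CSV field and doubles every
 * embedded double quote. Returns 0 on success and -1 on a write error. */
int csv_write_string(FILE *out, const char *s)
{
    assert(out != NULL && s != NULL);

    if (fputc('"', out) == EOF) return -1;          /* opening quote */
    for (; *s != '\0'; s++) {
        /* An embedded quote is written twice. */
        if (*s == '"' && fputc('"', out) == EOF) return -1;
        if (fputc(*s, out) == EOF) return -1;
    }
    if (fputc('"', out) == EOF) return -1;          /* closing quote */
    return 0;
}
]]></programlisting>
<para>
In the module above, a call such as <function>csv_write_string(out, name)</function> preceded by a separate <function>fputc(',', out)</function> could take the place of the <function>fprintf()</function> calls that print quoted names.
</para>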
</sect1>
</chapter>
</book>