<!-- $ Id: $ -->

  <section id="internals">
    <title>Rivet Internals</title>
    <para>
      This section easily falls out of date, as new code is added, old
      code is removed, and changes are made.  The best place to look
      is the source code itself.  If you are interested in the changes
      themselves, the Subversion revision control system
      (<command>svn</command>) can provide you with information about
      what has been happening with the code.
    </para>
    <section>
      <title>Initialization</title>
      <para>
			When Apache starts (or when each child Apache process
			starts, if a threaded Tcl is used),
			<function>Rivet_InitTclStuff</function> is called.  It
			creates either a single interpreter or one interpreter per
			virtual host, depending on the configuration.  It also
			initializes the <structname>RivetChan</structname>
			channel system, creates the Rivet-specific Tcl commands, and
			executes Rivet's <filename>init.tcl</filename>.  The caching
			system is also set up, and if a
			<command>GlobalInitScript</command> is configured, it is run.
      </para>
    </section>
    <section>
      <title>RivetChan</title>
      <para>
			The <structname>RivetChan</structname> system was created in
			order to have an actual Tcl channel that we could redirect
			standard output to.  This lets us use, for instance, the
			regular <command>puts</command> command in .rvt pages.  It
			works by creating a channel that buffers output, and, at
			predetermined times, passes it on to Apache's IO system.
			Tcl's regular standard output is replaced with an instance of
			this channel type, so that, by default, output will go to the
			web page.
      </para>
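      <para>
			The buffering idea can be illustrated with a small,
			self-contained C mock.  It is only a sketch: it uses neither
			the Tcl channel API nor Apache, and the names
			<structname>mock_chan</structname>,
			<function>chan_write</function>,
			<function>chan_flush</function> and
			<function>collect_sink</function> are hypothetical.  The
			policy of flushing when the buffer is full or when explicitly
			asked is an assumption standing in for the predetermined
			times mentioned above.
      </para>
      <programlisting><![CDATA[#include <stddef.h>
#include <string.h>

#define MOCK_BUF_SIZE 16  /* deliberately tiny so that flushes are easy to see */

/* Callback standing in for the web server's output routine (hypothetical). */
typedef void (*mock_sink)(const char *data, size_t len, void *ctx);

typedef struct {
    char      buf[MOCK_BUF_SIZE];
    size_t    used;   /* bytes currently held in buf */
    mock_sink sink;
    void     *ctx;    /* opaque pointer handed back to sink */
} mock_chan;

/* Pass any buffered bytes on to the sink and empty the buffer. */
void chan_flush(mock_chan *ch)
{
    if (ch->used > 0) {
        ch->sink(ch->buf, ch->used, ch->ctx);
        ch->used = 0;
    }
}

/* Buffer len bytes of output, flushing each time the buffer fills up. */
void chan_write(mock_chan *ch, const char *data, size_t len)
{
    while (len > 0) {
        size_t room = MOCK_BUF_SIZE - ch->used;
        size_t n = len < room ? len : room;
        memcpy(ch->buf + ch->used, data, n);
        ch->used += n;
        data += n;
        len -= n;
        if (ch->used == MOCK_BUF_SIZE)
            chan_flush(ch);
    }
}

/* Example sink: collects everything it receives and counts the calls. */
typedef struct {
    char   data[256];
    size_t len;
    size_t calls;
} mock_collected;

void collect_sink(const char *data, size_t len, void *ctx)
{
    mock_collected *c = ctx;
    if (c->len + len < sizeof(c->data)) {
        memcpy(c->data + c->len, data, len);
        c->len += len;
        c->data[c->len] = '\0';
    }
    c->calls++;
}]]></programlisting>
      <para>
			With a <structname>mock_chan</structname> whose sink is
			<function>collect_sink</function>, two ten-byte writes
			trigger one flush when the 16-byte buffer fills, and a final
			<function>chan_flush</function> delivers the remaining four
			bytes.
      </para>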
    </section>
    <section>
      <title>The <command>global</command> Command</title>
      <para>
			Rivet aims to run standard Tcl code with as few surprises as
			possible.  At times this involves some compromises, in this
			case regarding the <command>global</command> command.  The
			problem is that the command creates truly global
			variables.  A user who is just cutting and pasting some Tcl
			code into Rivet most likely only wants to share the
			variable in question with other procs, and does not really
			care whether the variable is actually persistent between
			pages.  The solution is a proc
			<command>::request::global</command> that takes the place of
			the <command>global</command> command in Rivet templates.  If
			you really need a true global variable, use either
			<command>::global</command> or add the :: namespace qualifier
			to variables you wish to make global.
      </para>
    </section>
    <section>
      <title>Page Parsing, Execution and Caching</title>
      <para>
			When a Rivet page is requested, it is transformed into an
			ordinary Tcl script by parsing the file for the &lt;? ?&gt;
			processing instruction tags.  Everything outside these tags
			becomes a large <command>puts</command> statement, and
			everything inside them remains Tcl code.
      </para>
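      <para>
			The transformation described above can be sketched in a few
			lines of self-contained C.  This is a simplified illustration
			and not Rivet's actual parser: the names
			<structname>strbuf</structname>,
			<function>sb_append</function>,
			<function>sb_puts</function>,
			<function>emit_text</function> and
			<function>template_to_tcl</function> are hypothetical, the
			use of <command>puts -nonewline</command> and the set of
			escaped characters are assumptions, and a closing tag
			appearing inside the Tcl code itself is not handled.
      </para>
      <programlisting><![CDATA[#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Growable, NUL-terminated string buffer (hypothetical helper). */
typedef struct {
    char  *data;
    size_t len;
    size_t cap;
} strbuf;

static void sb_append(strbuf *sb, const char *src, size_t n)
{
    if (sb->len + n + 1 > sb->cap) {
        size_t cap = sb->cap ? sb->cap : 64;
        while (cap < sb->len + n + 1)
            cap *= 2;
        char *p = realloc(sb->data, cap);
        if (p == NULL)
            abort();
        sb->data = p;
        sb->cap = cap;
    }
    memcpy(sb->data + sb->len, src, n);
    sb->len += n;
    sb->data[sb->len] = '\0';
}

static void sb_puts(strbuf *sb, const char *s)
{
    sb_append(sb, s, strlen(s));
}

/* Emit literal page text as one puts command, escaping the characters
 * that are special inside a double-quoted Tcl word. */
static void emit_text(strbuf *out, const char *text, size_t n)
{
    if (n == 0)
        return;
    sb_puts(out, "puts -nonewline \"");
    for (size_t i = 0; i < n; i++) {
        char c = text[i];
        if (c == '\n') {
            sb_puts(out, "\\n");
        } else {
            if (strchr("\\\"$[]", c) != NULL)
                sb_puts(out, "\\");
            sb_append(out, &c, 1);
        }
    }
    sb_puts(out, "\"\n");
}

/* Convert a template into a Tcl script.  The caller frees the result. */
char *template_to_tcl(const char *tpl)
{
    strbuf out = {0};
    const char *p = tpl;

    sb_puts(&out, "");  /* guarantees a non-NULL result for empty input */
    while (*p != '\0') {
        const char *open = strstr(p, "<?");
        if (open == NULL) {            /* no more code: the rest is text */
            emit_text(&out, p, strlen(p));
            break;
        }
        emit_text(&out, p, (size_t)(open - p));
        const char *code = open + 2;
        const char *close = strstr(code, "?>");
        /* An unterminated tag is treated as code up to the end of input. */
        size_t n = close ? (size_t)(close - code) : strlen(code);
        sb_append(&out, code, n);      /* Tcl code is copied unchanged */
        sb_puts(&out, "\n");
        if (close == NULL)
            break;
        p = close + 2;
    }
    return out.data;
}]]></programlisting>
      <para>
			Calling <function>template_to_tcl</function> on a page
			yields a script in which each run of literal text is a single
			<command>puts -nonewline</command> command and each code
			section appears verbatim on its own line.
      </para>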
      <para>
			Each .rvt file is evaluated in its own
			<constant>::request</constant> namespace, so that it is not
			necessary to create and tear down interpreters after each
			page.  Because each page runs in its own namespace, it will
			not run afoul of variables created by other scripts: those
			variables are deleted automatically when the namespace
			goes away after Apache finishes handling the request.
	      <note>
		    One current problem with this system is that while
		    variables are garbage collected, file handles are not, so
		    it is very important that Rivet script authors make
		    sure to close all the files they open.
	      </note>
      </para>
      <para>
	    	After a script has been loaded and parsed into its "pure Tcl"
	    	form, it is also cached, so that it may be used in the future
	    	without having to reload it (and re-parse it) from the disk.
	    	The number of scripts stored in memory is configurable.  This
	    	feature can significantly improve performance.
      </para>
    </section>
    <section>
        <title>Extending Rivet by developing C procedures implementing new commands</title>
        <para>
            Rivet endows the Tcl interpreter with new commands
            that serve as an interface between the application layer and the
            Apache web server. Many of these commands
            are meaningful only while an HTTP request is under way, that is,
            when a request_rec object allocated by the framework
            exists and has been passed to mod_rivet as an argument of a callback.
            A C procedure that needs access to a valid request_rec
            object must check that such a pointer exists and is initialized
            with valid data. For this purpose the procedure handling requests
            (Rivet_SendContent) makes a copy of the pointer and keeps it
            in an internal structure. The copy is set to NULL just before
            returning to the framework, right after mod_rivet has
            carried out its request processing. When the pointer copy is NULL
            the module is outside any request processing, and this
            condition invalidates the execution of
            many of the Rivet commands. If they are called in that state
            (for example in a ChildInitScript, GlobalInitScript,
            ServerInitScript or ChildExitScript), they fail with a Tcl error
            that you can handle with a <command>catch</command> command.
        </para>
        <para>            
            To perform this check, <option>src/rivet.h</option> defines the macro
            CHECK_REQUEST_REC, which accepts two arguments: the copy
            of the request_rec pointer (stored in the
            <structname>rivet_interp_globals</structname>
            structure) and the command name. If the pointer is NULL
            the macro calls Tcl_NoRequestRec and returns TCL_ERROR,
            causing the command to fail. These are the steps to follow
            to implement a new C language command for mod_rivet:
        </para>
        <itemizedlist>
            <listitem>
                Define the command and associated C language procedure
                in src/rivetcmds/rivetCore.c using the macro
                <option>RIVET_OBJ_CMD</option>
                <programlisting>RIVET_OBJ_CMD("mycmd",Rivet_MyCmd)</programlisting>
                This macro ensures the command is defined as <command>::rivet::mycmd</command>
            </listitem>
            <listitem>
                Add the code of Rivet_MyCmd to src/rivetcmds/rivetCore.c (if
                the code resides in a different file, src/Makefile.am must also be
                changed to tell the build system how to compile the code and
                link it into mod_rivet.so).
            </listitem>
            <listitem>
                If the code must gain access to <command>globals->r</command>,
                add the macro that tests the pointer:
                <programlisting>TCL_CMD_HEADER( Rivet_MyCmd )
{
    rivet_interp_globals *globals = Tcl_GetAssocData( interp, "rivet", NULL );
    ....
    CHECK_REQUEST_REC(globals->r,"::rivet::mycmd");
    ...   
}</programlisting>
            </listitem>
            <listitem>
                Add a test for this command in <option>tests/checkfails.tcl</option>. For 
                instance
                <programlisting>...
check_fail no_body
check_fail virtual_filename unkn
check_fail mycmd &lt;arg1&gt; &lt;arg2&gt;
....</programlisting>
                Here <option>&lt;arg1&gt; &lt;arg2&gt;</option> are optional
                arguments, for the case where the command needs to check <command>globals->r</command>
                only in special cases. Then, if <command>::rivet::mycmd</command> must fail,
                <option>tests/failtest.tcl</option> should also be modified as follows:
                <programlisting>virtual_filename->1
mycmd->1</programlisting>
                The value associated to the test must be <option>0</option> in case the
                command doesn't need to test the <command>globals->r</command> pointer.
            </listitem>
            
        </itemizedlist>
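        <para>
            The pointer check described in the steps above can be tried out
            without the Tcl or Apache headers. The following is a minimal mock
            and not the definition found in <option>src/rivet.h</option>:
            <structname>mock_request</structname>,
            <structname>mock_globals</structname>, MOCK_CHECK_REQUEST,
            <function>mock_mycmd</function>,
            <function>mock_send_content</function> and the CMD_OK and CMD_ERROR
            codes are hypothetical stand-ins, and the error message text is
            an assumption. Unlike CHECK_REQUEST_REC, the mock macro takes the
            globals structure rather than the pointer itself.
        </para>
        <programlisting><![CDATA[#include <stdio.h>
#include <stddef.h>

/* Stand-ins for the real Tcl and Apache types and return codes; every name
 * here is hypothetical and exists only so the sketch compiles on its own. */
enum { CMD_OK = 0, CMD_ERROR = 1 };

typedef struct {
    const char *uri;
} mock_request;

typedef struct {
    mock_request *r;            /* copy of the request pointer, NULL outside a request */
    char          errmsg[128];  /* last error message */
} mock_globals;

/* Record an error and make the calling function return CMD_ERROR when no
 * request is being processed. */
#define MOCK_CHECK_REQUEST(g, cmd_name)                               \
    do {                                                              \
        if ((g)->r == NULL) {                                         \
            snprintf((g)->errmsg, sizeof((g)->errmsg),                \
                     "%s: no request under way", (cmd_name));         \
            return CMD_ERROR;                                         \
        }                                                             \
    } while (0)

/* A command that is meaningful only while a request is being handled. */
int mock_mycmd(mock_globals *g, const char **result)
{
    MOCK_CHECK_REQUEST(g, "::rivet::mycmd");
    *result = g->r->uri;
    return CMD_OK;
}

/* Request handler: publishes the pointer for the duration of one request
 * and clears it again before returning to the framework. */
int mock_send_content(mock_globals *g, mock_request *req, const char **result)
{
    g->r = req;
    int rc = mock_mycmd(g, result);
    g->r = NULL;
    return rc;
}]]></programlisting>
        <para>
            Calling <function>mock_mycmd</function> directly while the stored
            pointer is NULL returns CMD_ERROR and fills in the error message,
            whereas the same command succeeds when it is reached through
            <function>mock_send_content</function>.
        </para>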
    </section>
    <section>
      <title>Debugging Rivet and Apache</title>
      <para>
	If you are interested in hacking on Rivet, you're welcome to
	contribute!  Invariably, when working with code, things go
	wrong, and it's necessary to do some debugging.  In a server
	environment like Apache, it can be a bit more difficult to
	find the right way to do this.  Here are some techniques to
	try.
      </para>
      <para>
	The first thing you should know is that Apache can be launched
	as a <emphasis>single process</emphasis> with the
	<option>-X</option> argument:</para>
      <programlisting>httpd -X</programlisting>
      <para>
	On Linux, one of the first things to try is the system call
	tracer, <command>strace</command>.  You don't even have to
	recompile Rivet or Apache for this to work.
      </para>

      <programlisting>strace -o /tmp/outputfile -s 1000 httpd -X</programlisting>

      <para>
      	This command runs httpd under the system call tracer,
			which leaves its output (there is potentially a lot of it) in
			<filename>/tmp/outputfile</filename>.  The <option>-s</option>
			option tells <command>strace</command> to print at most the
			first 1000 bytes of each string passed to a syscall.  Some
			calls such as <function>write</function> can potentially
			carry much more than this, so you may want to increase this
			number.  The result is a list of all the system calls made by
			the program.  Look at the end, where the failure presumably
			occurred, to see if you can find anything that looks like an
			error.  If you're not sure what to make of the results, you
			can always ask on the Rivet development mailing list.
      </para>

      <para>
			If <command>strace</command> (or its equivalent on your
			operating system) doesn't answer your question, it may be time
			to debug Apache and Rivet.  To do this, you will need to rebuild mod_rivet.
			First set the CFLAGS and LDFLAGS environment variables, then
			configure the build by running the
			<command>./configure</command> script with the
			<option>--enable-symbols</option> option:
      </para>
      <programlisting>export CFLAGS="-g -O0"
export LDFLAGS="-g"
./configure --enable-symbols ......
make
make install</programlisting>
		<para>
			Arguments to <command>./configure</command> must fit your Apache HTTP
			web server installation. See the output produced by
		</para>
		<programlisting>./configure --help</programlisting>
		<para>
			Also check the <xref linkend="installation">installation</xref> page
			for further information.
			Since it's easier to debug a single process, we'll still run
			Apache in single process mode with <option>-X</option>:
      </para>

      <programlisting>
@ashland [~] $ gdb /usr/sbin/apache.dbg
GNU gdb 5.3-debian
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "powerpc-linux"...
(gdb) run -X
Starting program: /usr/sbin/apache.dbg -X
[New Thread 16384 (LWP 13598)]
.
.
.
      </programlisting>

      <para>
	When your Apache session is up and running, you can request a
	web page with the browser and see where things go wrong (if
	you are dealing with a crash, for instance).
      </para>

    </section>

  </section>