******************************************************************************
README - Netstring, string processing functions for the net
******************************************************************************
==============================================================================
Abstract
==============================================================================
Netstring is a collection of string processing functions that are useful in
conjunction with Internet messages and protocols. In particular, it contains
functions for the following purposes:
- Parsing MIME messages
- Several encoding/decoding functions (Base 64, Quoted Printable, Q,
URL-encoding)
- A new implementation of the CGI interface that allows users to upload files
- A simple HTML parser
- URL parsing, printing and processing
- Conversion between character sets
==============================================================================
Download
==============================================================================
You can download Netstring as a gzip'ed tarball [1].
==============================================================================
Documentation
==============================================================================
Sorry, there is no manual. The mli files describe each function in detail.
Furthermore, the following additional information may be useful.
------------------------------------------------------------------------------
New CGI implementation
------------------------------------------------------------------------------
For a long time, the CGI implementation by Jean-Christophe Filliatre was the
only freely available module that implemented the CGI interface (it is also
based on code by Daniel de Rauglaudre). It worked well, but it did not support
file uploads because these require a parser for MIME messages.
The main goal of Netstring is to make such uploads possible, and for this
reason it contains an almost complete parser for MIME messages.
The new CGI implementation provides the same functions as the old one, plus
some extensions. If you call Cgi.parse_args(), you get the CGI parameters as
before, but as already explained this also works if the parameters are
encapsulated as a MIME message. In the HTML code, you can select the MIME
format by using
<form action="..." method="post" enctype="multipart/form-data">
...
</form>
- this "enctype" attribute forces the browser to send the form parameters as a
multipart MIME message (Note: The parameters of a conventional hyperlink
cannot be sent as a MIME message, and neither can form parameters if the
"method" is "get"). In many browsers only this particular encoding enables the
file upload elements; you cannot perform file uploads with other encodings.
As MIME messages can transport MIME types, filenames, and other additional
properties, it is also possible to get these using the enhanced interface.
After calling
Cgi.parse_arguments config
you can get all available information about a certain parameter by invoking
let param = Cgi.argument "name"
- where "param" has the type "argument". There are several accessor functions
to extract the various aspects of arguments (name, filename, value by string,
value by temporary file, MIME type, MIME header) from "argument" values.
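The idea of an opaque argument value with accessor functions can be sketched
as follows. This is only an illustration written against the OCaml standard
library: the module name Arg_sketch and all function names in it are invented
for this example and are not the names used by the Cgi module. Please consult
the mli file for the real interface.

```ocaml
(* Hypothetical model of an upload-aware CGI argument. All names are
   invented for illustration; this is NOT the API of the Cgi module. *)
module Arg_sketch : sig
  type t  (* opaque: callers cannot see how the value is stored *)
  val make_memory : name:string -> mime_type:string -> string -> t
  val make_file :
    name:string -> mime_type:string -> filename:string -> string -> t
  val name : t -> string
  val mime_type : t -> string
  val filename : t -> string option
  val value : t -> string
end = struct
  (* The value is kept either in memory or in a temporary file. *)
  type store = Memory of string | File of string
  type t = {
    name : string;
    mime_type : string;
    filename : string option;
    store : store;
  }

  let make_memory ~name ~mime_type v =
    { name; mime_type; filename = None; store = Memory v }

  (* The last argument is the path of the temporary file. *)
  let make_file ~name ~mime_type ~filename tmp_path =
    { name; mime_type; filename = Some filename; store = File tmp_path }

  let name a = a.name
  let mime_type a = a.mime_type
  let filename a = a.filename

  (* Return the value as a string, reading the temporary file if needed. *)
  let value a =
    match a.store with
    | Memory v -> v
    | File path ->
      let ic = open_in_bin path in
      let n = in_channel_length ic in
      let v = really_input_string ic n in
      close_in ic;
      v
end
```

Because the type is abstract, further properties could be added to such a
module later without changing the code of existing callers.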
------------------------------------------------------------------------------
Base64, and other encodings
------------------------------------------------------------------------------
Netstring is also the successor of the Base64 package. It provides a
Base64-compatible interface, and an enhanced API. The latter is contained in the
Netencoding module which also offers implementations of the "quoted printable",
"Q", and "URL" encodings. Please see netencoding.mli for details.
------------------------------------------------------------------------------
The MIME scanner functions
------------------------------------------------------------------------------
In the Mimestring module you can find several functions scanning parts of MIME
messages. These functions already cover most aspects of MIME messages: Scanning
of headers, analysis of structured header entries, and scanning of multipart
bodies. Of course, a full-featured MIME scanner would require some more
functions, especially concrete parsers for frequently occurring structures
(mail addresses or date strings).
Please see the file mimestring.mli for details.
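To give an impression of what scanning a header involves, here is a minimal
sketch using only the OCaml standard library. It is not the Mimestring
implementation, and the function names strip_cr and scan_header are invented
for this example. The sketch assumes that the header ends at the first empty
line, that continuation lines start with a space or a tab, and that it is
acceptable to join folded lines with a single space.

```ocaml
(* Illustrative sketch only: not the Mimestring API. *)

(* Remove a trailing carriage return, so CRLF and LF input both work. *)
let strip_cr line =
  let n = String.length line in
  if n > 0 && line.[n - 1] = '\r' then String.sub line 0 (n - 1) else line

(* Return the header fields as (lowercased name, value) pairs, in order.
   Scanning stops at the first empty line, which separates the header
   from the body. Lines without a colon are silently skipped. *)
let scan_header text =
  let lines = List.map strip_cr (String.split_on_char '\n' text) in
  let rec go acc = function
    | [] | "" :: _ -> List.rev acc
    | line :: rest when line.[0] = ' ' || line.[0] = '\t' ->
      (* Continuation line: append it to the previous field. *)
      (match acc with
       | (name, value) :: acc' ->
         go ((name, value ^ " " ^ String.trim line) :: acc') rest
       | [] -> go acc rest)
    | line :: rest ->
      (match String.index_opt line ':' with
       | Some i ->
         let name = String.lowercase_ascii (String.sub line 0 i) in
         let value =
           String.trim (String.sub line (i + 1) (String.length line - i - 1))
         in
         go ((name, value) :: acc) rest
       | None -> go acc rest)
  in
  go [] lines
```

Analysing structured values such as the parameters of a Content-Type field
needs a further tokenizing step that this sketch leaves out.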
------------------------------------------------------------------------------
The HTML parser
------------------------------------------------------------------------------
The HTML parser should be able to read every HTML file, whether it is correct
or not. The parser tries to recover from parsing errors as much as possible.
The parser returns the HTML term as a conventional recursive value (i.e. no
object-oriented design).
The parser needs some knowledge about HTML, which can be passed to it as a
simplified DTD. A DTD for HTML 4.0 is included in the module.
Please see the Nethtml module for details.
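What a "conventional recursive value" can look like is sketched below. The
type and the function names text_of and count_elements are defined locally for
this example and are assumptions; the definitions actually used by the parser
are found in the Nethtml module.

```ocaml
(* Illustrative sketch only: a tree type in the spirit of a
   "conventional recursive value". The names are assumptions. *)
type document =
  | Element of string * (string * string) list * document list
    (* element name, attributes, children *)
  | Data of string
    (* character data *)

(* Collect all character data in document order. *)
let rec text_of doc =
  match doc with
  | Data s -> s
  | Element (_, _, children) -> String.concat "" (List.map text_of children)

(* Count the elements with the given name. *)
let rec count_elements name doc =
  match doc with
  | Data _ -> 0
  | Element (n, _, children) ->
    let below =
      List.fold_left (fun acc d -> acc + count_elements name d) 0 children
    in
    if n = name then below + 1 else below

(* The tree for: <p class="intro">Hello, <b>world</b>!</p> *)
let example =
  Element ("p", [ ("class", "intro") ],
           [ Data "Hello, "; Element ("b", [], [ Data "world" ]); Data "!" ])
```

With such a representation, documents are processed by ordinary pattern
matching and recursion; no classes or methods are involved.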
------------------------------------------------------------------------------
The abstract data type URL
------------------------------------------------------------------------------
The module Neturl contains support for URL parsing and processing. The
implementation strictly follows the standards RFC 1738 and RFC 1808. URLs can
be parsed, and several accessor functions allow the user to get components of
parsed URLs, or to change components. Modifying URLs is safe; it is impossible
to create a URL that does not have a valid string representation.
Both absolute and relative URLs are supported. It is possible to apply a
relative URL to a base URL in order to get the corresponding absolute URL.
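Applying a relative URL to a base URL mostly comes down to merging the two
paths. The following sketch shows this step in the manner of RFC 1808, using
only the OCaml standard library. It is a simplification and not the Neturl
implementation: it handles the path component only (no scheme, host,
parameters, query or fragment), it never removes an empty path segment, and
the function names normalize and resolve_path are invented for this example.

```ocaml
(* Illustrative sketch only: not the Neturl API. *)

(* Remove "." segments and resolve "segment/.." pairs. A ".." that has
   nothing to cancel is kept, as RFC 1808 does for such cases. *)
let normalize segments =
  let rec go acc = function
    | [] -> List.rev acc
    | "." :: rest -> go acc rest
    | ".." :: rest ->
      (match acc with
       | prev :: acc' when prev <> ".." && prev <> "" -> go acc' rest
       | _ -> go (".." :: acc) rest)
    | seg :: rest -> go (seg :: acc) rest
  in
  go [] segments

(* Resolve the relative path [rel] against the base path [base]. *)
let resolve_path base rel =
  if rel = "" then base
  else if rel.[0] = '/' then rel
  else begin
    (* Keep the base path up to and including its last '/'. *)
    let dir =
      match List.rev (String.split_on_char '/' base) with
      | _ :: rest -> List.rev rest
      | [] -> []
    in
    let rel_segs = String.split_on_char '/' rel in
    (* A final "." or ".." denotes a directory: keep a trailing '/'. *)
    let rel_segs =
      match List.rev rel_segs with
      | ("." | "..") :: _ -> rel_segs @ [ "" ]
      | _ -> rel_segs
    in
    String.concat "/" (normalize (dir @ rel_segs))
  end
```

For example, resolving "../g" against the base path "/b/c/d" drops "d", then
cancels "c" against "..", and yields "/b/g".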
------------------------------------------------------------------------------
Conversion between character sets and encodings
------------------------------------------------------------------------------
The module Netconversion converts strings from one character set to another.
It is Unicode-based, and there are conversion tables for more than 50
encodings.
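"Unicode-based" means that a string is first decoded into Unicode code points
and then re-encoded in the target encoding. The simplest case is sketched
below with the OCaml standard library only: in ISO-8859-1 every byte is
already the number of its code point, so no conversion table is needed. This
is not the Netconversion implementation, and the function name utf8_of_latin1
is invented for this example. It assumes OCaml 4.06 or later.

```ocaml
(* Illustrative sketch only: not the Netconversion API.
   Requires OCaml 4.06 or later for Buffer.add_utf_8_uchar. *)

(* Convert an ISO-8859-1 string to UTF-8. Each input byte is taken as the
   Unicode code point with the same number and written in UTF-8. *)
let utf8_of_latin1 s =
  let b = Buffer.create (2 * String.length s) in
  String.iter
    (fun c -> Buffer.add_utf_8_uchar b (Uchar.of_int (Char.code c)))
    s;
  Buffer.contents b
```

Other encodings need a table that maps each byte or byte sequence to its code
point before the re-encoding step.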
==============================================================================
Author, Copying
==============================================================================
Netstring was written by Gerd Stolpmann [2]. You may copy it as you like, and
you may even use it for commercial purposes as long as the license conditions
are respected; see the file LICENSE that comes with the distribution. It
allows almost everything.
==============================================================================
History
==============================================================================
- Changed in 0.10.1: Removed the labels from Netstring_str for
O'Caml-3.03-alpha.
Important note: This release does not include the changes that have been
made to netstring by the ocamlnet project. It is only released to support
O'Caml 3.03-alpha immediately.
- Changed in 0.10: The CGI module has been extended. There is now a built-in
test bench; the module can be instantiated several times; improved
cache-control; support for cookies, and lots of smaller changes.
Neturl: Fixed apply_relative_path for a certain data case.
Netencoding.Url: New option ~plus.
Netdate: This module is new and contains functions to create date strings.
Nethtml: Added "essential blocks", i.e. elements that strictly require an
end tag.
Overall: The netstring package now depends on Unix.
- Changed in 0.9.8: Some fixes in Nethtml. There is now a relaxed HTML 4 DTD.
- Changed in 0.9.7: Fix: Mimestring
- Changed in 0.9.6: Nethtml.write omits end tags if end tags are forbidden.
- Changed in 0.9.5: Bugfixes and improvements in the HTML parser.
- Changed in 0.9.4: Improvements in the HTML parser. There is now a simplified
DTD which can much better represent the constraints of the DTD than the
previous list of empty elements. It should now parse every HTML 4.0 document
correctly even if end tags are omitted (where such omissions are allowed).
- Changed in 0.9.3: Fixed a bug in the "install" rule of the Makefile.
- Changed in 0.9.2: New format for the conversion tables which are now much
smaller.
- Changed in 0.9.1: Updated the Makefile such that (native-code) compilation
of netmappings.ml becomes possible.
- Changed in 0.9: Extended Mimestring module: It can now process RFC-2047
messages.
New Netconversion module which converts strings between character encodings.
- Changed in 0.8.1: Added the component url_accepts_8bits to
Neturl.url_syntax. This helps processing URLs which intentionally contain
bytes >= 0x80.
Fixed a bug: Every URL containing a 'j' was malformed!
- Changed in 0.8: Added the module Neturl which provides the abstract data
types of URLs.
The whole package is now thread-safe.
Added printers for the various opaque data types.
Added labels to function arguments where appropriate. The following
functions changed their signatures significantly: Cgi.mk_memory_arg,
Cgi.mk_file_arg.
- Changed in 0.7: Added workarounds for frequent browser bugs. Some functions
now take an additional argument specifying which workarounds are enabled.
- Changed in 0.6.1: Updated URLs in documentation.
- Changed in 0.6: The file upload has been re-implemented to support large
files; the file is now read block by block and the blocks can be collected
either in memory or in a temporary file.
Furthermore, the CGI API has been revised. There is now an opaque data type
"argument" that hides all implementation details and that is extensible (if
necessary, it is possible to add features without breaking the interface
again).
The CGI argument parser can be configured; currently it is possible to limit
the size of uploaded data, to control by which method arguments are
processed, and to set up where temporary files are created.
The other parts of the package that have nothing to do with CGI remain
unchanged.
- Changed in 0.5.1: A mistake in the documentation has been corrected.
- Initial version 0.5: The Netstring package is intended to be the successor
of the Base64-0.2 and the Cgi-0.3 packages. The sum of both numbers is 0.5,
and because of this, the first version number is 0.5.
--------------------------
[1] see http://www.ocaml-programming.de/packages/netstring-0.10.tar.gz
[2] see mailto:gerd@gerd-stolpmann.de