<!--#set var="title" value="Documentation of the Programmatic Interface (API) to The W3C Markup Validation Service"
--><!--#set var="relroot" value="../"
--><!--#include virtual="../header.html" -->
<div class="doc">
<h2>Markup Validator Web Service API<br />
SOAP 1.2 validation interface documentation</h2>
<p>Interface applications with the Markup Validator through its <strong>experimental</strong> API. This is version 0.2, dated May 2007. For a history of the format, see <a href="#changelog">Change Log</a>.</p>
<p><strong>Note</strong>: Please be considerate in using this shared, free resource.
Consider <a href="install.html">Installing your own instance of the validator</a>
for smooth and fast operation. Excessive use of the W3C Validation Service
will be blocked.</p>
<h3 id="TableOfContents">Table of Contents</h3>
<div id="toc">
<ul>
<li><a href="#requestformat">Validation Request Format</a></li>
<li><a href="#soap12format">SOAP format description</a>
<ul>
<li><a href="#soap12_sample">sample SOAP 1.2 validation response</a></li>
<li><a href="#soap12response">SOAP1.2 response format reference</a></li>
<li><a href="#soap12message">SOAP1.2 atomic message (error or warning) format reference</a></li>
<li><a href="#changelog">Change Log</a></li>
</ul>
</li>
<li><a href="#libs">Libraries</a></li>
<li><a href="#http_headers">Using HTTP headers to know validation results</a></li>
</ul>
</div>
<p id="skip"></p>
<h3 id="requestformat">Validation Request Format</h3>
<p>
Below is a table of the parameters you can use to send a query to the W3C
Markup Validator. All parameter values except data in
<code>uploaded_file</code> are expected to be encoded in the UTF-8 character
encoding.
</p>
<p>If you want to use W3C's public validation server, use the parameters below
in conjunction with the following base URI:<br />
<kbd>http://validator.w3.org/check</kbd> <br />
(replace with the address of your own server if you want to call a private instance of the validator)</p>
<p><strong>Note</strong>: If you wish to call the validator programmatically for a batch of documents,
please make sure that your script will <code>sleep</code> for <strong>at least 1 second</strong>
between requests. The Markup Validation service is a free, public service for all; your respect
is appreciated. Thanks.</p>
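<p>The pacing requested above can be sketched as follows. This is a minimal illustration in Python, not part of the validator itself: <code>validate_batch</code>, its <code>check</code> callback and its <code>sleep</code> parameter are hypothetical names chosen for this example, and the actual HTTP request is left to whatever function you pass in as <code>check</code>.</p>

```python
import time
from typing import Callable, Iterable, List, Tuple

MIN_DELAY_SECONDS = 1.0  # the minimum pause requested in the note above


def validate_batch(
    uris: Iterable[str],
    check: Callable[[str], str],
    delay: float = MIN_DELAY_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[str, str]]:
    """Call check(uri) for each URI, pausing between consecutive requests."""
    if delay < MIN_DELAY_SECONDS:
        raise ValueError("delay must be at least 1 second")
    results = []
    for index, uri in enumerate(uris):
        if index > 0:
            sleep(delay)  # pause before every request except the first
        results.append((uri, check(uri)))
    return results
```

<p>The <code>sleep</code> parameter exists only so that the pause can be replaced in tests; in normal use the default <code>time.sleep</code> is what enforces the delay.</p>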
<table class="refdoc">
<tr>
<th>Parameter</th><th>Description</th><th>Default value</th>
</tr>
<tr>
<th>uri</th>
<td>The <acronym title="Uniform Resource Locator">URL</acronym> of the document to validate</td>
<td>None, but either this parameter, or <code>uploaded_file</code>,
or <code>fragment</code> must be given.</td>
</tr>
<tr>
<th>uploaded_file</th>
<td>The document to validate, POSTed as multipart/form-data</td>
<td>None, but either this parameter, or <code>uri</code>,
or <code>fragment</code> must be given.</td>
</tr>
<tr>
<th>fragment</th>
<td>The source of the document to validate. Full documents only.</td>
<td>None, but either this parameter, or <code>uri</code>,
or <code>uploaded_file</code> must be given.</td>
</tr>
<tr>
<th>output</th>
<td>Triggers the various output formats of the validator. If unset, the usual
Web format will be sent. If set to <code>soap12</code>, the SOAP1.2 interface will
be triggered. See <a href="#soap12format">below for the SOAP 1.2 response format description</a>.</td>
<td>unset</td>
</tr>
<tr>
<th>charset</th>
<td>Character encoding override:
Specify the character encoding to use when parsing the document. When used with
the auxiliary parameter <code>fbc</code> set to 1, the given encoding will only be used as
a fallback value, in case the charset is absent or unrecognized. Note that this parameter
is ignored if validating a <code>fragment</code> with the direct input interface.</td>
<td>None, by default the validator detects the charset of the document automatically.</td>
</tr>
<tr>
<th>doctype</th>
<td>Document Type override:
Specify the Document Type (DOCTYPE) to use when parsing the document. When used
with the auxiliary parameter <code>fbd</code> set to 1, the given document type will only be used
as a fallback value, in case the document's DOCTYPE declaration is missing or unrecognized.</td>
<td>None, by default the validator detects the document type of the document automatically.</td>
</tr>
<tr>
<th>verbose</th>
<td>In the web interface, when set to 1, will make error messages, explanations
and other diagnostics more verbose. In SOAP output, does not have any impact.</td>
<td>0 (unset)</td>
</tr>
<tr>
<th>debug</th>
<td>When set to 1, will output some extra debugging information on the validated resource (such as HTTP headers) and validation process (such as parser used, parse mode etc.). In the SOAP output, this information will be given in <code>&lt;m:debug&gt;</code> elements.</td>
<td>0 (unset)</td>
</tr>
<tr>
<th>ss</th>
<td>Stands for <em>show source</em>. In the web interface, when set to 1, triggers the display of the source
after the validation results. In SOAP output, does not have any impact.</td>
<td>0 (unset)</td>
</tr>
<tr>
<th>outline</th>
<td>In the web interface, when set to 1, triggers the display of the document outline
after the validation results. In SOAP output, does not have any impact.</td>
<td>0 (unset)</td>
</tr>
</table>
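<p>As an illustration of how the parameters above combine with the base URI, here is a minimal Python sketch that builds a request URL for a document identified by its address. The helper name <code>build_check_url</code> is an assumption made for this example; only the parameter names and the base URI come from this page.</p>

```python
from urllib.parse import urlencode

# Public service; replace with your own server to call a private instance.
BASE_URI = "http://validator.w3.org/check"


def build_check_url(uri, base=BASE_URI, **options):
    """Return a GET URL asking the validator to check the document at `uri`."""
    params = {"uri": uri}
    params.update(options)  # e.g. output="soap12", debug=1, charset="utf-8"
    # urlencode percent-encodes parameter values as UTF-8 by default.
    return base + "?" + urlencode(params)


url = build_check_url("http://www.w3.org/", output="soap12", debug=1)
print(url)
```

<p>Requests using <code>uploaded_file</code> cannot be built this way, since that parameter must be POSTed as multipart/form-data.</p>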
<h3 id="soap12format">SOAP format description</h3>
<p>When called with parameter <code>output=soap12</code>, the validator will switch
to its SOAP 1.2 interface (experimental for now). Below is a sample response, as well as
a description of the most important elements of the response.</p>
<h4 id="soap12_sample">sample SOAP 1.2 validation response</h4>
<p>A SOAP response for the validation of an invalid document will look like this:</p>
<pre style="font-size: smaller">
&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope"&gt;
  &lt;env:Body&gt;
    &lt;<a href="#soap12_markupvalidationresponse">m:markupvalidationresponse</a>
        env:encodingStyle="http://www.w3.org/2003/05/soap-encoding"
        xmlns:m="http://www.w3.org/2005/10/markup-validator"&gt;
      &lt;<a href="#soap12_uri">m:uri</a>&gt;http://qa-dev.w3.org/wmvs/HEAD/dev/tests/xhtml1-bogus-element.html&lt;/m:uri&gt;
      &lt;<a href="#soap12_checkedby">m:checkedby</a>&gt;http://validator.w3.org/&lt;/m:checkedby&gt;
      &lt;<a href="#soap12_doctype">m:doctype</a>&gt;-//W3C//DTD XHTML 1.0 Transitional//EN&lt;/m:doctype&gt;
      &lt;<a href="#soap12_charset">m:charset</a>&gt;utf-8&lt;/m:charset&gt;
      &lt;<a href="#soap12_validity">m:validity</a>&gt;false&lt;/m:validity&gt;
      &lt;<a href="#soap12_errors">m:errors</a>&gt;
        &lt;<a href="#soap12_errorcount">m:errorcount</a>&gt;1&lt;/m:errorcount&gt;
        &lt;<a href="#soap12_errorlist">m:errorlist</a>&gt;
          &lt;<a href="#soap12_error">m:error</a>&gt;
            &lt;<a href="#soap12_line">m:line</a>&gt;13&lt;/m:line&gt;
            &lt;<a href="#soap12_col">m:col</a>&gt;6&lt;/m:col&gt;
            &lt;<a href="#soap12_source">m:source</a>&gt;
              &lt;![CDATA[
                &amp;#60;foo&lt;strong title="Position where error was detected."&gt;&amp;#62;&lt;/strong&gt;This phrase is enclosed in a bogus FOO element.&amp;#60;/foo&amp;#62;
              ]]&gt;
            &lt;/m:source&gt;
            &lt;<a href="#soap12_explanation">m:explanation</a>&gt;
              &lt;![CDATA[
                &lt;p&gt; ... &lt;/p&gt;
              ]]&gt;
            &lt;/m:explanation&gt;
            &lt;<a href="#soap12_messageid">m:messageid</a>&gt;76&lt;/m:messageid&gt;
            &lt;<a href="#soap12_message">m:message</a>&gt;element "foo" undefined&lt;/m:message&gt;
          &lt;/m:error&gt;
        &lt;/m:errorlist&gt;
      &lt;/m:errors&gt;
      &lt;m:warnings&gt;
        &lt;m:warningcount&gt;0&lt;/m:warningcount&gt;
        &lt;m:warninglist&gt;
        &lt;/m:warninglist&gt;
      &lt;/m:warnings&gt;
    &lt;/m:markupvalidationresponse&gt;
  &lt;/env:Body&gt;
&lt;/env:Envelope&gt;
</pre>
<h4 id="soap12response">SOAP1.2 response format reference</h4>
<table class="refdoc">
<tr><th>element</th><th>description</th></tr>
<tr>
<th id="soap12_markupvalidationresponse">markupvalidationresponse</th>
<td>The main element of the validation response. Encloses all other information about the validation results.</td>
</tr>
<tr>
<th id="soap12_uri">uri</th>
<td>The address of the validated document. This will likely be <kbd>upload://Form Submission</kbd>
if an uploaded document or fragment was validated.
In <a href="http://www.w3.org/WAI/ER/">EARL</a> terms, this is the <kbd>TestSubject</kbd>.
</td>
</tr>
<tr>
<th id="soap12_checkedby">checkedby</th>
<td>Location of the service which provided the validation result.
In <a href="http://www.w3.org/WAI/ER/">EARL</a> terms, this is the <kbd>Assertor</kbd>.
</td>
</tr>
<tr>
<th id="soap12_doctype">doctype</th>
<td>Detected (or forced) Document Type for the validated document</td>
</tr>
<tr>
<th id="soap12_charset">charset</th>
<td>Detected (or forced) Character Encoding for the validated document</td>
</tr>
<tr>
<th id="soap12_validity">validity</th>
<td>Whether the validated document passed formal validation (boolean: <code>true</code> or <code>false</code>)</td>
</tr>
<tr>
<th id="soap12_errors">errors</th>
<td>Encapsulates all data about errors encountered through the validation process</td>
</tr>
<tr>
<th id="soap12_errorcount">errorcount</th>
<td>a child of <a href="#soap12_errors">errors</a>, counts the number of errors listed</td>
</tr>
<tr>
<th id="soap12_errorlist">errorlist</th>
<td>a child of <a href="#soap12_errors">errors</a>, contains the list of errors (surprise!)</td>
</tr>
<tr>
<th id="soap12_error">error</th>
<td>a child of <a href="#soap12_errorlist">errorlist</a>, contains the information on a single
validation error. </td>
</tr>
</table>
<p><strong>Note</strong>: <code>warnings</code>, <code>warningcount</code>,
<code>warninglist</code> and <code>warning</code> are similar to, respectively,
<code><a href="#soap12_errors">errors</a></code>,
<code><a href="#soap12_errorcount">errorcount</a></code>,
<code><a href="#soap12_errorlist">errorlist</a></code> and
<code><a href="#soap12_error">error</a></code>.
</p>
<h4 id="soap12message">SOAP1.2 atomic message (error or warning) format reference</h4>
<p>As seen in the example above, the children of the <code><a href="#soap12_error">error</a></code>
element, and likewise of the <code>warning</code> element, include <code>line</code>, <code>col</code> and
<code>message</code>. These and the other child elements are defined below:</p>
<table class="refdoc">
<tr><th>element</th><th>description</th></tr>
<tr>
<th id="soap12_line">line</th>
<td>Within the source code of the validated document, refers to the line where the error was detected.</td>
</tr>
<tr>
<th id="soap12_col">col</th>
<td>Within the source code of the validated document, refers to the column of the line where the error was detected.</td>
</tr>
<tr>
<th id="soap12_message">message</th>
<td>The actual error message</td>
</tr>
<tr>
<th id="soap12_messageid">messageid</th>
<td>The number/identifier of the error, as addressed internally by the validator</td>
</tr>
<tr>
<th id="soap12_explanation">explanation</th>
<td>Explanation for the error. Given as HTML fragment within CDATA block.</td>
</tr>
<tr>
<th id="soap12_source">source</th>
<td>Snippet of the source where the error was found. Given as HTML fragment within CDATA block.</td>
</tr>
</table>
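<p>To show how the elements described above can be read by a client, here is a minimal Python sketch using the standard library's <code>xml.etree.ElementTree</code>. The function name <code>parse_validation_response</code> and the shape of the returned dictionary are assumptions made for this example; the namespaces and element names are those of the sample response. Line and column numbers are kept as strings.</p>

```python
import xml.etree.ElementTree as ET

# Namespace URIs taken from the sample response.
NS = {
    "env": "http://www.w3.org/2003/05/soap-envelope",
    "m": "http://www.w3.org/2005/10/markup-validator",
}


def parse_validation_response(xml_text):
    """Extract the main fields of a SOAP 1.2 validation response."""
    root = ET.fromstring(xml_text)
    response = root.find("env:Body/m:markupvalidationresponse", NS)
    if response is None:
        raise ValueError("no m:markupvalidationresponse element found")

    def text(parent, name):
        # Missing elements yield an empty string rather than an error.
        return (parent.findtext("m:" + name, default="", namespaces=NS) or "").strip()

    def messages(path):
        return [
            {
                "line": text(node, "line"),
                "col": text(node, "col"),
                "messageid": text(node, "messageid"),
                "message": text(node, "message"),
            }
            for node in response.findall(path, NS)
        ]

    return {
        "uri": text(response, "uri"),
        "doctype": text(response, "doctype"),
        "charset": text(response, "charset"),
        "valid": text(response, "validity") == "true",
        "errors": messages("m:errors/m:errorlist/m:error"),
        "warnings": messages("m:warnings/m:warninglist/m:warning"),
    }
```

<p>The <code>source</code> and <code>explanation</code> elements are left out here; since they carry HTML fragments inside CDATA blocks, a client would typically read their text in the same way and treat it as markup.</p>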
<h4 id="changelog">Change Log</h4>
<p>Up to version 0.2, all changes are backward-compatible.</p>
<dl>
<dt>v 0.2 (June 2007)</dt>
<dd><ul>
<li><code>debug</code> parameter now has an effect on both HTML and SOAP outputs</li>
<li><code>messageid</code> is now implemented</li>
<li>added <code>source</code> and <code>explanation</code> elements.</li>
</ul>
</dd>
<dt>v 0.1</dt>
<dd><p>Initial revision</p></dd>
</dl>
<h3 id="libs">Libraries</h3>
<p>Building of libraries used to interact with the validator's API is <a href="http://www.w3.org/QA/2006/10/validator_api.html">encouraged</a>. If you are the
maintainer of such a library, <a href="../feedback.html">contact
us</a> and we will list it here.</p>
<p>W3C has not reviewed or verified these implementations and does not
endorse them.</p>
<h4>Known libraries for the W3C Markup Validator API</h4>
<ul>
<li><a href="http://search.cpan.org/dist/WebService-Validator-HTML-W3C/">WebService::Validator::HTML::W3C</a> in Perl,
by Struan Donald.</li>
<li><a href="http://pear.php.net/package/Services_W3C_HTMLValidator" title="PEAR Package : Services_W3C_HTMLValidator">Services_W3C_HTMLValidator</a> in PHP, a PEAR library by Brett Bieber.</li>
<li><a href="http://www.clickfind.com.au/developers-directory/code.cfm">ColdFusion (MX7) class</a> by the clickfind team.</li>
<li><a href="http://sourceforge.net/projects/w3cmarkupvalida/">C# library</a> by María Eugenia Fernández Menéndez.</li>
<li><a href="http://www.rexsl.com/rexsl-w3c/">Java client</a> by the ReXSL.com team.</li>
<li><a href="http://pypi.python.org/pypi/py_w3c/">Python library</a> by Kazbek Byasov.</li>
</ul>
<h3 id="http_headers">Using HTTP headers to know validation results</h3>
<p>Every validation result is served over HTTP, with custom headers giving a simple, quick way
to get validation results without having to parse the response body. This is a simpler (but less detailed) alternative to using
the full API described above.</p>
<p>The HTTP headers for a validation results page will generally look like:</p>
<pre>
HEAD 'http://validator.localhost/check?uri=http%3A%2F%2Fwww.w3.org'
200 OK
[...]
Content-Language: en
Content-Type: text/html; charset=utf-8
X-W3C-Validator-Errors: 0
X-W3C-Validator-Warnings: 0
X-W3C-Validator-Recursion: 1
X-W3C-Validator-Status: Valid
</pre>
<p>The headers and their values are as follows:</p>
<table>
<tr><th>Header</th><th>Value</th><th>Notes</th></tr>
<tr>
<td>X-W3C-Validator-Status</td>
<td><code>Valid</code> or <code>Invalid</code> if validation was performed.<br />
The value will be <code>Abort</code> if a fatal error (decoding, 404 not found, etc.)
was encountered and validation could not be performed</td>
<td>Note: <code>Abort</code> value was added in version 0.8.0</td>
</tr>
<tr>
<td>X-W3C-Validator-Errors</td>
<td>Number of Errors found during validation. <code>0</code> if no errors found.</td>
<td>0 does not necessarily mean "valid" (it may mean that validation could not be performed)</td>
</tr>
<tr>
<td>X-W3C-Validator-Warnings</td>
<td>Number of Warnings found during validation. <code>0</code> if no warnings found.</td>
<td>The warnings include validation warnings, as well as pre-parsing warnings (such as character encoding mismatch, doctype override, etc.)</td>
</tr>
<tr>
<td>X-W3C-Validator-Recursion</td>
<td>Integer. Generally, <code>1</code>. More if recursively validating validation results.
</td>
<td> The validator will use this in conjunction with its <code>Max Recursion</code> setup to avoid
abusive recursion (Denial of Service attack).</td>
</tr>
</table>
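<p>The header table above can be turned into a small helper. The following Python sketch is only an illustration: the function name <code>summarize_validator_headers</code> and the returned dictionary are assumptions made for this example, and the headers are expected as a plain mapping of names to values, however your HTTP client provides them. Validity is derived from <code>X-W3C-Validator-Status</code> alone, because an error count of 0 does not necessarily mean "valid".</p>

```python
def summarize_validator_headers(headers):
    """Reduce the X-W3C-Validator-* headers to a small result dictionary."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    status = lowered.get("x-w3c-validator-status")

    def count(name):
        value = lowered.get(name)
        if value is not None and value.strip().isdigit():
            return int(value)
        return None  # header missing or not a number

    return {
        "status": status,  # "Valid", "Invalid", "Abort", or None if absent
        "valid": status == "Valid",
        "errors": count("x-w3c-validator-errors"),
        "warnings": count("x-w3c-validator-warnings"),
    }
```

<p>With the sample headers shown earlier, this yields a status of <code>Valid</code> with 0 errors and 0 warnings; a response with status <code>Abort</code> is reported as not valid even though its error count may be 0.</p>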
</div>
<!--#include virtual="../footer.html" -->
</body>
</html>