1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153
|
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Restrict parser network access</title>
<meta name="generator" content="DocBook XSL Stylesheets V1.76.1">
<link rel="home" href="index.html" title="Raptor RDF Syntax Library Manual">
<link rel="up" href="tutorial-parsing.html" title="Parsing syntaxes to RDF Triples">
<link rel="prev" href="tutorial-parser-content.html" title="Provide syntax content to parse">
<link rel="next" href="tutorial-parser-static-info.html" title="Querying parser static information">
<meta name="generator" content="GTK-Doc V1.18 (XML mode)">
<link rel="stylesheet" href="style.css" type="text/css">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table class="navigation" id="top" width="100%" summary="Navigation header" cellpadding="2" cellspacing="2"><tr valign="middle">
<td><a accesskey="p" href="tutorial-parser-content.html"><img src="left.png" width="24" height="24" border="0" alt="Prev"></a></td>
<td><a accesskey="u" href="tutorial-parsing.html"><img src="up.png" width="24" height="24" border="0" alt="Up"></a></td>
<td><a accesskey="h" href="index.html"><img src="home.png" width="24" height="24" border="0" alt="Home"></a></td>
<th width="100%" align="center">Raptor RDF Syntax Library Manual</th>
<td><a accesskey="n" href="tutorial-parser-static-info.html"><img src="right.png" width="24" height="24" border="0" alt="Next"></a></td>
</tr></table>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="restrict-parser-network-access"></a>Restrict parser network access</h2></div></div></div>
<p>
Parsing can cause network requests to be performed, especially
if a URI is given as an argument such as with
<a class="link" href="raptor2-section-parser.html#raptor-parser-parse-uri" title="raptor_parser_parse_uri ()"><code class="function">raptor_parser_parse_uri()</code></a>
however there may also be indirect requests such as with the
GRDDL parser that retrieves URIs depending on the results of
initial parse requests. The URIs requested may not be wanted
to be fetched or need to be filtered, and this can be done in
three ways.
</p>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="tutorial-filter-network-with-feature"></a>Filtering parser network requests with option <a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-NO-NET:CAPS"><code class="literal">RAPTOR_OPTION_NO_NET</code></a>
</h3></div></div></div>
<p>
The parser option
<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-NO-NET:CAPS"><code class="literal">RAPTOR_OPTION_NO_NET</code></a>
can be set with
<a class="link" href="raptor2-section-parser.html#raptor-parser-set-option" title="raptor_parser_set_option ()"><code class="function">raptor_parser_set_option()</code></a>
and forbids all network requests. There is no customisation with
this approach, for that see the URI filter in the next section.
</p>
<pre class="programlisting">
rdf_parser = raptor_new_parser(world, "rdfxml");
/* Disable internal network requests */
raptor_parser_set_option(rdf_parser, RAPTOR_OPTION_NO_NET, NULL, 1);
</pre>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="tutorial-filter-network-www-uri-filter"></a>Filtering parser network requests with <a class="link" href="raptor2-section-www.html#raptor-www-set-uri-filter" title="raptor_www_set_uri_filter ()"><code class="function">raptor_www_set_uri_filter()</code></a>
</h3></div></div></div>
<p>
The
<a class="link" href="raptor2-section-www.html#raptor-www-set-uri-filter" title="raptor_www_set_uri_filter ()"><code class="function">raptor_www_set_uri_filter()</code></a>
allows setting of a filtering function to operate on all URIs
retrieved by a WWW connection. This connection can be used in
parsing when operated by hand.
</p>
<pre class="programlisting">
void write_bytes_handler(raptor_www* www, void *user_data,
const void *ptr, size_t size, size_t nmemb) {
{
raptor_parser* rdf_parser = (raptor_parser*)user_data;
raptor_parser_parse_chunk(rdf_parser, (unsigned char*)ptr, size*nmemb, 0);
}
int uri_filter(void* filter_user_data, raptor_uri* uri) {
/* return non-0 to forbid the request */
}
int main(int argc, char *argv[]) {
...
rdf_parser = raptor_new_parser(world, "rdfxml");
www = raptor_new_www(world);
/* filter all URI requests */
raptor_www_set_uri_filter(www, uri_filter, filter_user_data);
/* make WWW write bytes to parser */
raptor_www_set_write_bytes_handler(www, write_bytes_handler, rdf_parser);
raptor_parser_parse_start(rdf_parser, uri);
raptor_www_fetch(www, uri);
/* tell the parser that we are done */
raptor_parser_parse_chunk(rdf_parser, NULL, 0, 1);
raptor_free_www(www);
raptor_free_parser(rdf_parser);
...
}
</pre>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="tutorial-filter-network-parser-uri-filter"></a>Filtering parser network requests with <a class="link" href="raptor2-section-parser.html#raptor-parser-set-uri-filter" title="raptor_parser_set_uri_filter ()"><code class="function">raptor_parser_set_uri_filter()</code></a>
</h3></div></div></div>
<p>
The
<a class="link" href="raptor2-section-parser.html#raptor-parser-set-uri-filter" title="raptor_parser_set_uri_filter ()"><code class="function">raptor_parser_set_uri_filter()</code></a>
allows setting of a filtering function to operate on all URIs that
the parser sees. This operates on the internal raptor_www object
used inside parsing to retrieve URIs, similar to that described in
the <a class="link" href="restrict-parser-network-access.html#tutorial-filter-network-www-uri-filter" title="Filtering parser network requests with raptor_www_set_uri_filter()">previous section</a>.
</p>
<pre class="programlisting">
int uri_filter(void* filter_user_data, raptor_uri* uri) {
/* return non-0 to forbid the request */
}
rdf_parser = raptor_new_parser(world, "rdfxml");
raptor_parser_set_uri_filter(rdf_parser, uri_filter, filter_user_data);
/* parse content as normal */
raptor_parser_parse_uri(rdf_parser, uri, base_uri);
</pre>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="tutorial-filter-network-parser-timeout"></a>Setting timeout for parser network requests with option <a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-WWW-TIMEOUT:CAPS"><code class="literal">RAPTOR_OPTION_WWW_TIMEOUT</code></a>
</h3></div></div></div>
<p>If the value of option
<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-WWW-TIMEOUT:CAPS"><code class="literal">RAPTOR_OPTION_WWW_TIMEOUT</code></a>
if set to a number >0, it is used as the timeout in seconds
for retrieving of URIs during parsing (primarily for GRDDL).
This uses
<a class="link" href="raptor2-section-www.html#raptor-www-set-connection-timeout" title="raptor_www_set_connection_timeout ()"><code class="function">raptor_www_set_connection_timeout()</code></a>
internally.
</p>
<pre class="programlisting">
rdf_parser = raptor_new_parser(world, "grddl");
/* set internal URI retrieval maximum time to 5 seconds */
raptor_parser_set_option(rdf_parser, RAPTOR_OPTION_WWW_TIMEOUT, NULL, 5);
</pre>
</div>
</div>
<div class="footer">
<hr>
Generated by GTK-Doc V1.18</div>
</body>
</html>
|