File: http-authentication.html

package info (click to toggle)
nodebox-web 1.9.2-2
  • links: PTS
  • area: main
  • in suites: lenny
  • size: 1,724 kB
  • ctags: 1,254
  • sloc: python: 6,161; sh: 602; xml: 239; makefile: 33
file content (127 lines) | stat: -rw-r--r-- 12,649 bytes parent folder | download | duplicates (5)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Password-Protected Feeds [Universal Feed Parser]</title>
<link rel="stylesheet" href="feedparser.css" type="text/css">
<link rev="made" href="mailto:mark@diveintomark.org">
<meta name="generator" content="DocBook XSL Stylesheets V1.65.1">
<meta name="keywords" content="RSS, Atom, CDF, XML, feed, parser, Python">
<link rel="start" href="index.html" title="Documentation">
<link rel="up" href="http.html" title="HTTP Features">
<link rel="prev" href="http-redirect.html" title="HTTP Redirects">
<link rel="next" href="http-other.html" title="Other HTTP Headers">
</head>
<body id="feedparser-org" class="docs">
<div class="z" id="intro"><div class="sectionInner"><div class="sectionInner2">
<div class="s" id="pageHeader">
<h1><a href="/"><span>Universal Feed Parser</span></a></h1>
<p><span>Parse RSS and Atom feeds in Python.  3000 unit tests.  Open source.</span></p>
</div>
<div class="s" id="quickSummary"><ul>
<li class="li1">
<a href="http://sourceforge.net/projects/feedparser/"><span>Download</span></a> ·</li>
<li class="li2">
<a href="http://feedparser.org/docs/"><span>Documentation</span></a> ·</li>
<li class="li3">
<a href="http://feedparser.org/tests/"><span>Unit tests</span></a> ·</li>
<li class="li4"><a href="http://sourceforge.net/tracker/?func=browse&amp;group_id=112328&amp;atid=661937"><span>Report a bug</span></a></li>
</ul></div>
</div></div></div>
<div id="main"><div id="mainInner">
<p id="breadcrumb">You are here: <a href="index.html">Documentation</a> → <a href="http.html">HTTP Features</a> → <span class="thispage">Password-Protected Feeds</span></p>
<div class="section" lang="en">
<div class="titlepage">
<div>
<div><h2 class="title">
<a name="http.auth" class="skip" href="#http.auth" title="link to this section"><img src="images/permalink.gif" alt="[link]" title="link to this section" width="8" height="9"></a> Password-Protected Feeds</h2></div>
<div><div class="abstract">
<h3 class="title"></h3>
<p><span class="application">Universal Feed Parser</span> supports downloading and parsing password-protected feeds that are protected by <acronym title="Hypertext Transfer Protocol">HTTP</acronym> authentication.  Both basic and digest authentication are supported.</p>
</div></div>
</div>
<div></div>
</div>
<p>The easiest way is to embed the username and password in the feed <acronym title="Uniform Resource Locator">URL</acronym> itself.</p>
<div class="example">
<a name="example.auth.inline" class="skip" href="#example.auth.inline" title="link to this example"><img src="images/permalink.gif" alt="[link]" title="link to this example" width="8" height="9"></a> <h3 class="title">Example: Downloading a feed protected by basic authentication (the easy way)</h3>
<p>In this example, the username is <tt class="literal">test</tt> and the password is <tt class="literal">basic</tt>.</p>
<pre class="screen"><tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput"><font color='navy'><b>import</b></font> feedparser</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">d = feedparser.parse('<a href="http://feedparser.org/docs/examples/basic_auth.xml">http://test:basic@feedparser.org/docs/examples/basic_auth.xml</a>')</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">d.feed.title</span>
<span class="computeroutput">u'Sample Feed'</span></pre>
</div>
<p>The same technique works for digest authentication.  (Technically, <span class="application">Universal Feed Parser</span> will attempt basic authentication first, but if that fails and the server indicates that it requires digest authentication, <span class="application">Universal Feed Parser</span> will automatically re-request the feed with the appropriate digest authentication headers.  <span class="emphasis"><em>This means that this technique will send your password to the server in an easily decryptable form.</em></span>)</p>
<div class="example">
<a name="example.auth.inline.digest" class="skip" href="#example.auth.inline.digest" title="link to this example"><img src="images/permalink.gif" alt="[link]" title="link to this example" width="8" height="9"></a> <h3 class="title">Example: Downloading a feed protected by digest authentication (the easy but horribly insecure way)</h3>
<p>In this example, the username is <tt class="literal">test</tt> and the password is <tt class="literal">digest</tt>.</p>
<pre class="screen"><tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput"><font color='navy'><b>import</b></font> feedparser</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">d = feedparser.parse('<a href="http://feedparser.org/docs/examples/digest_auth.xml">http://test:digest@feedparser.org/docs/examples/digest_auth.xml</a>')</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">d.feed.title</span>
<span class="computeroutput">u'Sample Feed'</span></pre>
</div>
<p>You can also construct a <tt class="classname">HTTPBasicAuthHandler</tt> that contains the password information, then pass that as a handler to the <tt class="function">parse</tt> function.  <tt class="classname">HTTPBasicAuthHandler</tt> is part of the standard <a href="http://docs.python.org/lib/module-urllib2.html"><tt class="filename">urllib2</tt></a> module.</p>
<div class="example">
<a name="example.auth.basic" class="skip" href="#example.auth.basic" title="link to this example"><img src="images/permalink.gif" alt="[link]" title="link to this example" width="8" height="9"></a> <h3 class="title">Example: Downloading a feed protected by <acronym title="Hypertext Transfer Protocol">HTTP</acronym> basic authentication (the hard way)</h3>
<pre class="programlisting python"><font color='navy'><b>import</b></font> urllib2, feedparser

<font color='green'><i># Construct the authentication handler</i></font>
auth = urllib2.HTTPBasicAuthHandler()

<font color='green'><i># Add password information: realm, host, user, password.</i></font>
<font color='green'><i># A single handler can contain passwords for multiple sites;</i></font>
<font color='green'><i># urllib2 will sort out which passwords get sent to which sites</i></font>
<font color='green'><i># based on the realm and host of the URL you're retrieving</i></font>
auth.add_password(<font color='olive'>'BasicTest'</font>, <font color='olive'>'feedparser.org'</font>, <font color='olive'>'test'</font>, <font color='olive'>'basic'</font>)

<font color='green'><i># Pass the authentication handler to the feed parser.</i></font>
<font color='green'><i># handlers is a list because there might be more than one</i></font>
<font color='green'><i># type of handler (urllib2 defines lots of different ones,</i></font>
<font color='green'><i># and you can build your own)</i></font>
d = feedparser.parse('<a href="http://feedparser.org/docs/examples/basic_auth.xml">http://feedparser.org/docs/examples/basic_auth.xml</a>', \
                     handlers=[auth])</pre>
</div>
<p>Digest authentication is handled in much the same way, by constructing an <tt class="classname">HTTPDigestAuthHandler</tt> and populating it with the necessary realm, host, user, and password information.  This is more secure than <a href="http-authentication.html#example.auth.inline.digest" title="Example: Downloading a feed protected by digest authentication (the easy but horribly insecure way)">stuffing the username and password in the URL</a>, since the password will be encrypted before being sent to the server.</p>
<div class="example">
<a name="example.auth.digest" class="skip" href="#example.auth.digest" title="link to this example"><img src="images/permalink.gif" alt="[link]" title="link to this example" width="8" height="9"></a> <h3 class="title">Example: Downloading a feed protected by <acronym title="Hypertext Transfer Protocol">HTTP</acronym> digest authentication (the secure way)</h3>
<pre class="programlisting python"><font color='navy'><b>import</b></font> urllib2, feedparser

auth = urllib2.HTTPDigestAuthHandler()
auth.add_password(<font color='olive'>'DigestTest'</font>, <font color='olive'>'feedparser.org'</font>, <font color='olive'>'test'</font>, <font color='olive'>'digest'</font>)
d = feedparser.parse('<a href="http://feedparser.org/docs/examples/digest_auth.xml">http://feedparser.org/docs/examples/digest_auth.xml</a>', \
                     handlers=[auth])</pre>
</div>
<a name="caution.urllib2.buggy.as.all.hell"></a><table class="caution" border="0" summary="">
<tr><td rowspan="2" align="center" valign="top" width="1%"><img src="images/caution.png" alt="Caution" title="" width="24" height="24"></td></tr>
<tr><td colspan="2" align="left" valign="top" width="99%">Prior to <span class="application">Python</span> 2.3.3, <tt class="filename">urllib2</tt> did not properly support digest authentication.  Mac OS X 10.3 “<span class="quote">Panther</span>” ships with <span class="application">Python</span> 2.3.  Users of previous versions of Mac OS X will need to upgrade to the latest version of <span class="application">Python</span> in order to use digest authentication.</td></tr>
</table>
<p>The examples so far have assumed that you know in advance that the feed is password-protected.  But what if you don't know?</p>
<p>If you try to download a password-protected feed without sending all the proper password information, the server will return an <acronym title="Hypertext Transfer Protocol">HTTP</acronym> status code <tt class="constant">401</tt>.  <span class="application">Universal Feed Parser</span> makes this status code available in <tt class="varname">d.status</tt>.</p>
<p>Details on the authentication scheme are in <tt class="varname">d.headers['www-authenticate']</tt>.  <span class="application">Universal Feed Parser</span> does not do any further parsing on this field; you will need to parse it yourself.  Everything before the first space is the type of authentication (probably <tt class="constant">Basic</tt> or <tt class="constant">Digest</tt>), which controls which type of handler you'll need to construct.  The realm name is given as <tt class="literal">realm="foo"</tt> -- so <tt class="literal">foo</tt> would be your first argument to <tt class="methodname">auth.add_password</tt>.  Other information in the <tt class="literal">www-authenticate</tt> header is probably safe to ignore; the <tt class="filename">urllib2</tt> module will handle it for you.</p>
<div class="example">
<a name="example.auth.required" class="skip" href="#example.auth.required" title="link to this example"><img src="images/permalink.gif" alt="[link]" title="link to this example" width="8" height="9"></a> <h3 class="title">Example: Determining that a feed is password-protected</h3>
<pre class="screen"><tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput"><font color='navy'><b>import</b></font> feedparser</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">d = feedparser.parse('<a href="http://feedparser.org/docs/examples/digest_auth.xml">http://feedparser.org/docs/examples/digest_auth.xml</a>')</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">d.status</span>
<span class="computeroutput">401</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">d.headers[<font color='olive'>'www-authenticate'</font>]</span>
<span class="computeroutput">'Basic realm="Use test/basic"'</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">d = feedparser.parse('<a href="http://feedparser.org/docs/examples/digest_auth.xml">http://feedparser.org/docs/examples/digest_auth.xml</a>')</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">d.status</span>
<span class="computeroutput">401</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">d.headers[<font color='olive'>'www-authenticate'</font>]</span>
<span class="computeroutput">'Digest realm="DigestTest",
 nonce="+LV/uLLdAwA=5d77397291261b9ef256b034e19bcb94f5b7992a",
 algorithm=MD5,
 qop="auth"'</span></pre>
</div>
</div>
<div style="float: left">← <a class="NavigationArrow" href="http-redirect.html">HTTP Redirects</a>
</div>
<div style="text-align: right">
<a class="NavigationArrow" href="http-other.html">Other HTTP Headers</a> →</div>
<hr style="clear:both">
<div class="footer"><p class="copyright">Copyright © 2004, 2005, 2006 Mark Pilgrim</p></div>
</div></div>
</body>
</html>