File: NEWS

5.1.2 - May 3, 2012
    * Minor changes to the documentation
    * Strip potentially dangerous ENTITY declarations in encoded feeds
    * feedparser will now try to continue parsing despite compression errors
    * Fix issue 321 a little more (the initial fix missed a code path)
    * Issue 337 (`_parse_date_rfc822()` returns None on single-digit days)
    * Issue 343 (add magnet links to the ACCEPTABLE_URI_SCHEMES)
    * Issue 344 (handle deflated data with no headers nor checksums)
    * Issue 347 (support `itunes:image` elements with a `url` attribute)
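
Issue 344 concerns servers that label a response as deflate but send a raw
DEFLATE stream, with no zlib header and no checksum. A minimal sketch of one
way to accept both forms using only the standard zlib module; the helper name
`inflate` is hypothetical and this is not necessarily how feedparser itself
handles it:

```python
import zlib


def inflate(data):
    """Decompress deflate data, with or without the zlib wrapper."""
    try:
        # Standard zlib stream: 2-byte header plus Adler-32 checksum.
        return zlib.decompress(data)
    except zlib.error:
        # Raw DEFLATE stream: a negative wbits value tells zlib to
        # expect neither a header nor a trailing checksum.
        return zlib.decompress(data, -zlib.MAX_WBITS)


payload = b"<rss version='2.0'></rss>" * 4

# A normal zlib-wrapped stream.
wrapped = zlib.compress(payload)

# A raw stream, produced by requesting negative wbits when compressing.
compressor = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
raw = compressor.compress(payload) + compressor.flush()

print(inflate(wrapped) == payload, inflate(raw) == payload)
```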

5.1.1 - March 20, 2012
    * Fix mistakes, typos, and bugs in the unit test code
    * Fix crash in Python 2.4 and 2.5 if the feed has a UTF-32 byte order mark
    * Replace the RFC822 date parser for more extensibility
    * Issue 304 (handle RFC822 dates with timezones like GMT+00:00)
    * Issue 309 (itunes:keywords should be split by commas, not whitespace)
    * Issue 310 (pubDate should map to `published`, not `updated`)
    * Issue 313 (include the compression test files in MANIFEST.in)
    * Issue 314 (far-flung RFC822 dates don't throw OverflowError on x64)
    * Issue 315 (HTTP server for unit tests runs on 0.0.0.0)
    * Issue 321 (malformed URIs can cause ValueError to be thrown)
    * Issue 322 (HTTP redirect to HTTP 304 causes SAXParseException)
    * Issue 323 (installing chardet causes 11 unit test failures)
    * Issue 325 (map `description_detail` to `summary_detail`)
    * Issue 326 (Unicode filename causes UnicodeEncodeError if locale is ASCII)
    * Issue 327 (handle RFC822 dates with extraneous commas)
    * Issue 328 (temporarily map `updated` to `published` due to issue 310)
    * Issue 329 (escape backslashes in Windows path in docs/introduction.rst)
    * Issue 331 (don't escape backslashes that are in raw strings in the docs)

5.1 - December 2, 2011
    * Extensive, extensive unit test refactoring
    * Convert the Docbook documentation to ReST
    * Include the documentation in the source distribution
    * Consolidate the disparate README files into one
    * Support Jython somewhat (almost all unit tests pass)
    * Support Python 3.2
    * Fix Python 3 issues exposed by improved unit tests
    * Fix international domain name issues exposed by improved unit tests
    * Issue 148 (loose parser doesn't always return unicode strings)
    * Issue 204 (FeedParserDict behavior should not be controlled by `assert`)
    * Issue 247 (mssql date parser uses hardcoded Tokyo timezone)
    * Issue 249 (KeyboardInterrupt and SystemExit exceptions being caught)
    * Issue 250 (`updated` can be a 9-tuple or a string, depending on context)
    * Issue 252 (running setup.py in Python 3 fails due to missing sgmllib)
    * Issue 253 (document that text/plain content isn't sanitized)
    * Issue 260 (Python 3 doesn't decompress gzip'ed or deflate'd content)
    * Issue 261 (popping from empty tag list)
    * Issue 262 (docs are missing from distribution files)
    * Issue 264 (vcard parser crashes on non-ascii characters)
    * Issue 265 (http header comparisons are case sensitive)
    * Issue 271 (monkey-patching sgmllib breaks other libraries)
    * Issue 272 (can't pass bytes or str to `parse()` in Python 3)
    * Issue 275 (`_parse_date()` doesn't catch OverflowError)
    * Issue 276 (mutable types used as default values in `parse()`)
    * Issue 277 (`python3 setup.py install` fails)
    * Issue 281 (`_parse_date()` doesn't catch ValueError)
    * Issue 282 (`_parse_date()` crashes when passed `None`)
    * Issue 285 (crash on empty xmlns attribute)
    * Issue 286 ('apos' character entity not handled properly)
    * Issue 289 (add an option to disable microformat parsing)
    * Issue 290 (Blogger's invalid img tags are unparseable)
    * Issue 292 (atom id element not explicitly supported)
    * Issue 294 ('categories' key exists but raises KeyError)
    * Issue 297 (unresolvable external doctype causes crash)
    * Issue 298 (nested nodes clobber actual values)
    * Issue 300 (performance improvements)
    * Issue 303 (unicode characters cause crash during relative uri resolution)
    * Remove "Hot RSS" support since the format doesn't actually exist
    * Remove the old feedparser.org website files from the source
    * Remove the feedparser command line interface
    * Remove the Zope interoperability hack
    * Remove extraneous whitespace

5.0.1 - February 20, 2011
    * Fix issue 91 (invalid text in XML declaration causes sanitizer to crash)
    * Fix issue 254 (sanitization can be bypassed by malformed XML comments)
    * Fix issue 255 (sanitizer doesn't strip unsafe URI schemes)

5.0 - January 25, 2011
    * Improved MathML support
    * Support microformats (rel-tag, rel-enclosure, xfn, hcard)
    * Support IRIs
    * Allow safe CSS through sanitization
    * Allow safe HTML5 through sanitization
    * Support SVG
    * Support inline XML entity declarations
    * Support unescaped quotes and angle brackets in attributes
    * Support additional date formats
    * Added the `request_headers` argument to parse()
    * Added the `response_headers` argument to parse()
    * Support multiple entry, feed, and source authors
    * Officially make Python 2.4 the earliest supported version
    * Support Python 3
    * Bug fixes, bug fixes, bug fixes

===============================================================================

1.0 - 9/27/2002 - MAP - fixed namespace processing on prefixed RSS 2.0 elements,
  added Simon Fell's test suite

1.1 - 9/29/2002 - MAP - fixed infinite loop on incomplete CDATA sections

2.0 - 10/19/2002
  JD - use inchannel to watch out for image and textinput elements which can
  also contain title, link, and description elements
  JD - check for isPermaLink='false' attribute on guid elements
  JD - replaced openAnything with open_resource supporting ETag and
  If-Modified-Since request headers
  JD - parse now accepts etag, modified, agent, and referrer optional
  arguments
  JD - modified parse to return a dictionary instead of a tuple so that any
  etag or modified information can be returned and cached by the caller
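
The etag and modified arguments described above exist to support conditional
HTTP requests. A minimal sketch of building such a request with the standard
library, assuming the cached ETag is sent back as If-None-Match and the
modification time is a Unix timestamp; the function name
`build_conditional_request` and the example URL are hypothetical, and this is
not the 2.0 open_resource code:

```python
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, etag=None, modified=None):
    """Return a Request carrying whatever cache validators the caller has."""
    request = urllib.request.Request(url)
    if etag:
        # The server's ETag value is echoed back in If-None-Match.
        request.add_header("If-None-Match", etag)
    if modified is not None:
        # HTTP dates are always expressed in GMT.
        request.add_header("If-Modified-Since",
                           formatdate(modified, usegmt=True))
    return request


# No network access happens here; the request is only constructed.
req = build_conditional_request("http://example.org/feed.xml",
                                etag='"abc123"', modified=0)
for name, value in req.header_items():
    print(name, value)
```

A server that honours these headers answers 304 with no body when nothing has
changed, which is the case the 2.0.1 entry below deals with.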

2.0.1 - 10/21/2002 - MAP - changed parse() so that if we don't get anything
  because of etag/modified, return the old etag/modified to the caller to
  indicate why nothing is being returned

2.0.2 - 10/21/2002 - JB - added the inchannel to the if statement, otherwise it's
  useless.  Fixes the problem JD was addressing by adding it.

2.1 - 11/14/2002 - MAP - added gzip support

2.2 - 1/27/2003 - MAP - added attribute support, admin:generatorAgent.
  start_admingeneratoragent is an example of how to handle elements with
  only attributes, no content.

2.3 - 6/11/2003 - MAP - added USER_AGENT for default (if caller doesn't specify);
  also, make sure we send the User-Agent even if urllib2 isn't available.
  Match any variation of backend.userland.com/rss namespace.

2.3.1 - 6/12/2003 - MAP - if item has both link and guid, return both as-is.

2.4 - 7/9/2003 - MAP - added preliminary Pie/Atom/Echo support based on Sam Ruby's
  snapshot of July 1 <http://www.intertwingly.net/blog/1506.html>; changed
  project name

2.5 - 7/25/2003 - MAP - changed to Python license (all contributors agree);
  removed unnecessary urllib code -- urllib2 should always be available anyway;
  return actual url, status, and full HTTP headers (as result['url'],
  result['status'], and result['headers']) if parsing a remote feed over HTTP --
  this should pass all the HTTP tests at <http://diveintomark.org/tests/client/http/>;
  added the latest namespace-of-the-week for RSS 2.0

2.5.1 - 7/26/2003 - RMK - clear opener.addheaders so we only send our custom
  User-Agent (otherwise urllib2 sends two, which confuses some servers)

2.5.2 - 7/28/2003 - MAP - entity-decode inline xml properly; added support for
  inline <xhtml:body> and <xhtml:div> as used in some RSS 2.0 feeds

2.5.3 - 8/6/2003 - TvdV - patch to track whether we're inside an image or
  textInput, and also to return the character encoding (if specified)

2.6 - 1/1/2004 - MAP - dc:author support (MarekK); fixed bug tracking
  nested divs within content (JohnD); fixed missing sys import (JohanS);
  fixed regular expression to capture XML character encoding (Andrei);
  added support for Atom 0.3-style links; fixed bug with textInput tracking;
  added support for cloud (MartijnP); added support for multiple
  category/dc:subject (MartijnP); normalize content model: 'description' gets
  description (which can come from description, summary, or full content if no
  description), 'content' gets dict of base/language/type/value (which can come
  from content:encoded, xhtml:body, content, or fullitem);
  fixed bug matching arbitrary Userland namespaces; added xml:base and xml:lang
  tracking; fixed bug tracking unknown tags; fixed bug tracking content when
  <content> element is not in default namespace (like Pocketsoap feed);
  resolve relative URLs in link, guid, docs, url, comments, wfw:comment,
  wfw:commentRSS; resolve relative URLs within embedded HTML markup in
  description, xhtml:body, content, content:encoded, title, subtitle,
  summary, info, tagline, and copyright; added support for pingback and
  trackback namespaces

2.7 - 1/5/2004 - MAP - really added support for trackback and pingback
  namespaces, as opposed to 2.6 when I said I did but didn't really;
  sanitize HTML markup within some elements; added mxTidy support (if
  installed) to tidy HTML markup within some elements; fixed indentation
  bug in _parse_date (FazalM); use socket.setdefaulttimeout if available
  (FazalM); universal date parsing and normalization (FazalM): 'created', 'modified',
  'issued' are parsed into 9-tuple date format and stored in 'created_parsed',
  'modified_parsed', and 'issued_parsed'; 'date' is duplicated in 'modified'
  and vice-versa; 'date_parsed' is duplicated in 'modified_parsed' and vice-versa

2.7.1 - 1/9/2004 - MAP - fixed bug handling &quot; and &apos;.  fixed memory
  leak not closing url opener (JohnD); added dc:publisher support (MarekK);
  added admin:errorReportsTo support (MarekK); Python 2.1 dict support (MarekK)

2.7.4 - 1/14/2004 - MAP - added workaround for improperly formed <br/> tags in
  encoded HTML (skadz); fixed unicode handling in normalize_attrs (ChrisL);
  fixed relative URI processing for guid (skadz); added ICBM support; added
  base64 support

2.7.5 - 1/15/2004 - MAP - added workaround for malformed DOCTYPE (seen on many
  blogspot.com sites); added _debug variable

2.7.6 - 1/16/2004 - MAP - fixed bug with StringIO importing

3.0b3 - 1/23/2004 - MAP - parse entire feed with real XML parser (if available);
  added several new supported namespaces; fixed bug tracking naked markup in
  description; added support for enclosure; added support for source; re-added
  support for cloud which got dropped somehow; added support for expirationDate

3.0b4 - 1/26/2004 - MAP - fixed xml:lang inheritance; fixed multiple bugs tracking
  xml:base URI, one for documents that don't define one explicitly and one for
  documents that define an outer and an inner xml:base that goes out of scope
  before the end of the document

3.0b5 - 1/26/2004 - MAP - fixed bug parsing multiple links at feed level

3.0b6 - 1/27/2004 - MAP - added feed type and version detection, result['version']
  will be one of SUPPORTED_VERSIONS.keys() or empty string if unrecognized;
  added support for creativeCommons:license and cc:license; added support for
  full Atom content model in title, tagline, info, copyright, summary; fixed bug
  with gzip encoding (not always telling server we support it when we do)

3.0b7 - 1/28/2004 - MAP - support Atom-style author element in author_detail
  (dictionary of 'name', 'url', 'email'); map author to author_detail if author
  contains name + email address

3.0b8 - 1/28/2004 - MAP - added support for contributor

3.0b9 - 1/29/2004 - MAP - fixed check for presence of dict function; added
  support for summary

3.0b10 - 1/31/2004 - MAP - incorporated ISO-8601 date parsing routines from
  xml.util.iso8601

3.0b11 - 2/2/2004 - MAP - added 'rights' to list of elements that can contain
  dangerous markup; fiddled with decodeEntities (not right); liberalized
  date parsing even further

3.0b12 - 2/6/2004 - MAP - fiddled with decodeEntities (still not right);
  added support for Atom 0.2 subtitle; added support for Atom content model
  in copyright; better sanitizing of dangerous HTML elements with end tags
  (script, frameset)

3.0b13 - 2/8/2004 - MAP - better handling of empty HTML tags (br, hr, img,
  etc.) in embedded markup, in either HTML or XHTML form (<br>, <br/>, <br />)

3.0b14 - 2/8/2004 - MAP - fixed CDATA handling in non-wellformed feeds under
  Python 2.1

3.0b15 - 2/11/2004 - MAP - fixed bug resolving relative links in wfw:commentRSS;
  fixed bug capturing author and contributor URL; fixed bug resolving relative
  links in author and contributor URL; fixed bug resolving relative links in
  generator URL; added support for recognizing RSS 1.0; passed Simon Fell's
  namespace tests, and included them permanently in the test suite with his
  permission; fixed namespace handling under Python 2.1

3.0b16 - 2/12/2004 - MAP - fixed support for RSS 0.90 (broken in b15)

3.0b17 - 2/13/2004 - MAP - determine character encoding as per RFC 3023

3.0b18 - 2/17/2004 - MAP - always map description to summary_detail (Andrei);
  use libxml2 (if available)

3.0b19 - 3/15/2004 - MAP - fixed bug exploding author information when author
  name was in parentheses; removed ultra-problematic mxTidy support; patch to
  work around crash in PyXML/expat when encountering invalid entities
  (MarkMoraes); support for textinput/textInput

3.0b20 - 4/7/2004 - MAP - added CDF support

3.0b21 - 4/14/2004 - MAP - added Hot RSS support

3.0b22 - 4/19/2004 - MAP - changed 'channel' to 'feed', 'item' to 'entries' in
  results dict; changed results dict to allow getting values with results.key
  as well as results[key]; work around embedded illformed HTML with half
  a DOCTYPE; work around malformed Content-Type header; if character encoding
  is wrong, try several common ones before falling back to regexes (if this
  works, bozo_exception is set to CharacterEncodingOverride); fixed character
  encoding issues in BaseHTMLProcessor by tracking encoding and converting
  from Unicode to raw strings before feeding data to sgmllib.SGMLParser;
  convert each value in results to Unicode (if possible), even if using
  regex-based parsing
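
Allowing results.key as well as results[key] can be done with a small dict
subclass. A minimal sketch of the idea, assuming nothing about the real
FeedParserDict beyond what this entry states; the class name `AttrDict` is
hypothetical:

```python
class AttrDict(dict):
    """A dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup
        # fails, so real dict methods such as .keys() still work.
        try:
            return self[name]
        except KeyError:
            # Raise AttributeError so hasattr() and getattr() with a
            # default behave as callers expect.
            raise AttributeError(name) from None


results = AttrDict(feed=AttrDict(title="Example feed"), entries=[])
print(results.feed.title)
print(results["feed"]["title"])
```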

3.0b23 - 4/21/2004 - MAP - fixed UnicodeDecodeError for feeds that contain
  high-bit characters in attributes in embedded HTML in description (thanks
  Thijs van de Vossen); moved guid, date, and date_parsed to mapped keys in
  FeedParserDict; tweaked FeedParserDict.has_key to return True if asking
  about a mapped key

3.0fc1 - 4/23/2004 - MAP - made results.entries[0].links[0] and
  results.entries[0].enclosures[0] into FeedParserDict; fixed typo that could
  cause the same encoding to be tried twice (even if it failed the first time);
  fixed DOCTYPE stripping when DOCTYPE contained entity declarations;
  better textinput and image tracking in illformed RSS 1.0 feeds

3.0fc2 - 5/10/2004 - MAP - added and passed Sam's amp tests; added and passed
  my blink tag tests

3.0fc3 - 6/18/2004 - MAP - fixed bug in _changeEncodingDeclaration that
  failed to parse utf-16 encoded feeds; made source into a FeedParserDict;
  duplicate admin:generatorAgent/@rdf:resource in generator_detail.url;
  added support for image; refactored parse() fallback logic to try other
  encodings if SAX parsing fails (previously it would only try other encodings
  if re-encoding failed); remove unichr madness in normalize_attrs now that
  we're properly tracking encoding in and out of BaseHTMLProcessor; set
  feed.language from root-level xml:lang; set entry.id from rdf:about;
  send Accept header

3.0 - 6/21/2004 - MAP - don't try iso-8859-1 (can't distinguish between
  iso-8859-1 and windows-1252 anyway, and most incorrectly marked feeds are
  windows-1252); fixed regression that could cause the same encoding to be
  tried twice (even if it failed the first time)

3.0.1 - 6/22/2004 - MAP - default to us-ascii for all text/* content types;
  recover from malformed content-type header parameter with no equals sign
  ('text/xml; charset:iso-8859-1')
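
A minimal sketch of the two behaviors in this entry: tolerating a parameter
written with a colon instead of an equals sign, and defaulting text/* types to
us-ascii. The function names `parse_content_type` and `declared_encoding` are
hypothetical and the code is an illustration, not the 3.0.1 implementation:

```python
def parse_content_type(header):
    """Split a Content-Type value into (media_type, params)."""
    media_type, _, rest = header.partition(";")
    params = {}
    for part in rest.split(";"):
        part = part.strip()
        # Prefer the correct '=' separator; fall back to ':' for
        # malformed parameters such as 'charset:iso-8859-1'.
        for sep in ("=", ":"):
            if sep in part:
                key, _, value = part.partition(sep)
                params[key.strip().lower()] = value.strip().strip("'\"")
                break
    return media_type.strip().lower(), params


def declared_encoding(header):
    """Return the charset from the header, applying the text/* default."""
    media_type, params = parse_content_type(header)
    if "charset" in params:
        return params["charset"]
    if media_type.startswith("text/"):
        return "us-ascii"
    return None


print(declared_encoding("text/xml; charset:iso-8859-1"))
print(declared_encoding("text/plain"))
```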

3.1 - 6/28/2004 - MAP - added and passed tests for converting HTML entities
  to Unicode equivalents in illformed feeds (aaronsw); added and
  passed tests for converting character entities to Unicode equivalents
  in illformed feeds (aaronsw); test for valid parsers when setting
  XML_AVAILABLE; make version and encoding available when server returns
  a 304; add handlers parameter to pass arbitrary urllib2 handlers (like
  digest auth or proxy support); add code to parse username/password
  out of url and send as basic authentication; expose downloading-related
  exceptions in bozo_exception (aaronsw); added __contains__ method to
  FeedParserDict (aaronsw); added publisher_detail (aaronsw)
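
Pulling a username and password out of a URL and turning them into a basic
authentication header can be sketched with the standard library as follows.
The function name `split_credentials` and the example URL are hypothetical,
and the sketch ignores percent-encoded credentials and IPv6 hosts:

```python
import base64
from urllib.parse import urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_credentials, authorization_value_or_None)."""
    parts = urlsplit(url)
    if parts.username is None:
        return url, None
    # Rebuild the network location without the user:password@ prefix.
    host = parts.hostname
    if parts.port is not None:
        host = "%s:%d" % (host, parts.port)
    clean = urlunsplit((parts.scheme, host, parts.path,
                        parts.query, parts.fragment))
    # Basic authentication is base64("user:password").
    token = ("%s:%s" % (parts.username, parts.password or "")).encode("utf-8")
    return clean, "Basic " + base64.b64encode(token).decode("ascii")


clean_url, auth = split_credentials(
    "http://user:secret@example.org:8080/feed.xml?x=1")
print(clean_url)
print(auth)
```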

3.2 - 7/3/2004 - MAP - use cjkcodecs and iconv_codec if available; always
  convert feed to UTF-8 before passing to XML parser; completely revamped
  logic for determining character encoding and attempting XML parsing
  (much faster); increased default timeout to 20 seconds; test for presence
  of Location header on redirects; added tests for many alternate character
  encodings; support various EBCDIC encodings; support UTF-16BE and
  UTF-16LE with or without a BOM; support UTF-8 with a BOM; support
  UTF-32BE and UTF-32LE with or without a BOM; fixed crashing bug if no
  XML parsers are available; added support for 'Content-encoding: deflate';
  send blank 'Accept-encoding: ' header if neither gzip nor zlib modules
  are available
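
A minimal sketch of byte order mark detection for the encodings listed in this
entry, using the BOM constants from the standard codecs module. The function
name `sniff_bom` is hypothetical and this is not the 3.2 detection logic,
which also has to cope with feeds that carry no BOM at all:

```python
import codecs

# UTF-32 must be checked before UTF-16: the UTF-32LE mark (FF FE 00 00)
# begins with the UTF-16LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32be"),
    (codecs.BOM_UTF32_LE, "utf-32le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16be"),
    (codecs.BOM_UTF16_LE, "utf-16le"),
]


def sniff_bom(data):
    """Return (encoding, data_without_bom), or (None, data) if no BOM."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, data[len(bom):]
    return None, data


sample = codecs.BOM_UTF32_LE + "<feed/>".encode("utf-32le")
encoding, body = sniff_bom(sample)
print(encoding, body.decode(encoding))
```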

3.3 - 7/15/2004 - MAP - optimize EBCDIC to ASCII conversion; fix obscure
  problem tracking xml:base and xml:lang if element declares it, child
  doesn't, first grandchild redeclares it, and second grandchild doesn't;
  refactored date parsing; defined public registerDateHandler so callers
  can add support for additional date formats at runtime; added support
  for OnBlog, Nate, MSSQL, Greek, and Hungarian dates (ytrewq1); added
  zopeCompatibilityHack() which turns FeedParserDict into a regular
  dictionary, required for Zope compatibility, and also makes command-
  line debugging easier because pprint module formats real dictionaries
  better than dictionary-like objects; added NonXMLContentType exception,
  which is stored in bozo_exception when a feed is served with a non-XML
  media type such as 'text/plain'; respect Content-Language as default
  language if no xml:lang is present; cloud dict is now FeedParserDict;
  generator dict is now FeedParserDict; better tracking of xml:lang,
  including support for xml:lang='' to unset the current language;
  recognize RSS 1.0 feeds even when RSS 1.0 namespace is not the default
  namespace; don't overwrite final status on redirects (scenarios:
  redirecting to a URL that returns 304, redirecting to a URL that
  redirects to another URL with a different type of redirect); add
  support for HTTP 303 redirects
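
The idea behind a public date handler registry can be sketched as a list of
parser functions that are tried in turn. The names below
(`register_date_handler`, `parse_date`, `_parse_iso_day`) are hypothetical
stand-ins chosen for illustration, and the sketch assumes that handlers return
a time tuple or None; it is not the 3.3 code:

```python
import time

_date_handlers = []


def register_date_handler(func):
    """Add a handler; the most recently registered one is tried first."""
    _date_handlers.insert(0, func)


def parse_date(text):
    """Return the first successful handler result, or None."""
    for handler in _date_handlers:
        try:
            result = handler(text)
        except (ValueError, OverflowError):
            # A handler that does not understand the format simply
            # fails, and the next one gets a chance.
            continue
        if result is not None:
            return result
    return None


def _parse_iso_day(text):
    # Handles plain 'YYYY-MM-DD' dates only.
    return time.strptime(text, "%Y-%m-%d")


register_date_handler(_parse_iso_day)
print(parse_date("2004-07-15")[:3])
```

Trying newer handlers first lets a caller override a built-in format without
touching the built-in handlers.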

4.0 - MAP - support for relative URIs in xml:base attribute; fixed
  encoding issue with mxTidy (phopkins); preliminary support for RFC 3229;
  support for Atom 1.0; support for iTunes extensions; new 'tags' for
  categories/keywords/etc. as array of dict
  {'term': term, 'scheme': scheme, 'label': label} to match Atom 1.0
  terminology; parse RFC 822-style dates with no time; lots of other
  bug fixes

4.1 - MAP - removed socket timeout; added support for chardet library