1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202
|
NAME
XML::Atom::Microformats - parse microformats in Atom content
SYNOPSIS
use XML::Atom::Microformats;
my $feed = XML::Atom::Microformats
-> new_feed( $xml, $base_uri )
-> assume_profile( qw(hCard hCalendar) );
print $feed->json(pretty => 1);
my $results = RDF::Query
-> new( $sparql )
-> execute( $feed->model );
DESCRIPTION
The XML::Atom::Microformats module brings the functionality of
HTML::Microformats to Atom 1.0 Syndication feeds. It finds microformats
embedded in the <content> elements (note: not <summary>) of Atom
entries.
The general pattern of usage is to create an XML::Atom::Microformats
object (which corresponds to an Atom 1.0 feed) using the "new_feed"
method; tell it which types of Microformat you expect to find there;
then ask for the data, as a Perl hashref, a JSON string, or an
RDF::Trine model.
Constructor
"XML::Atom::Microformats->new_feed($xml, $base_url)"
Constructs a feed object.
$xml is the Atom source (string) or an XML::LibXML::Document.
$base_url is the feed URL, important for resolving relative URL
references.
Profile Management Methods
HTML::Microformats uses HTML profiles (i.e. the profile attribute on the
HTML <head> element) to detect which Microformats are used on a page.
Any microformats which do not have a profile URI declared will not be
parsed.
XML::Atom::Microformats uses a similar mechanism. Because Atom does not
have a <head> element, Atom <link> is used instead:
<link rel="profile" href="http://ufs.cc/x/hcalendar" />
These links can be used on a per-entry basis, or for the whole feed.
Because many feeds fail to properly declare which profiles they use,
there are various profile management methods to tell
XML::Atom::Microformats to assume the presence of particular profile
URIs, even if they're actually missing.
"add_profile(@profiles)"
Using "add_profile" you can add one or more profile URIs, and they
are treated as if they were found on the document.
For example:
$feed->add_profile('http://microformats.org/profile/rel-tag')
This is useful for adding profile URIs declared outside the document
itself (e.g. in HTTP headers).
"entry_add_profile($entryid, @profiles)"
"entry_add_profile" is a variant to allow you to add a profile which
applies only to one specific entry within the feed, if you know that
entry's ID.
"assume_profile(@microformats)"
For example:
$feed->assume_profile(qw(hCard adr geo))
This method acts similarly to "add_profile" but allows you to use
names of microformats rather than URIs. Microformat names are case
sensitive, and must match HTML::Microformats::Format::Foo module
names.
"entry_assume_profile($entryid, @profiles)"
"entry_assume_profile" is a variant to allow you to add a profile
which applies only to one specific entry within the feed, if you
know that entry's ID.
"assume_all_profiles"
This method is equivalent to calling "assume_profile" for all known
microformats.
"entry_assume_all_profiles($entryid)"
This method is equivalent to calling "entry_assume_profile" for all
known microformats.
Parsing Methods
You can probably skip this section. The "data", "json" and "model"
methods will automatically do this for you.
"parse_microformats"
Scans through the feed, finding microformat objects.
On subsequent calls, does nothing (as everything is already parsed).
"clear_microformats"
Forgets information gleaned by "parse_microformats" and thus allows
"parse_microformats" to be run again. This is useful if you've added
some profiles between runs of "parse_microformats".
Data Retrieval Methods
These methods allow you to retrieve the feed's data, and do things with
it.
"objects($format)"
$format is, for example, 'hCard', 'adr' or 'RelTag'.
Returns a list of objects of that type. (If called in scalar
context, returns an arrayref.)
Each object is, for example, an HTML::Microformat::hCard object, or
an HTML::Microformat::RelTag object, etc. See the relevent
documentation for details.
"entry_objects($entryid, $format)"
"entry_objects" is a variant to allow you to fetch data for one
specific entry within the feed, if you know that entry's ID.
"all_objects"
Returns a hashref of data. Each hashref key is the name of a
microformat (e.g. 'hCard', 'RelTag', etc), and the values are
arrayrefs of objects.
Each object is, for example, an HTML::Microformat::hCard object, or
an HTML::Microformat::RelTag object, etc. See the relevent
documentation for details.
"entry_all_objects($entryid)"
"entry_all_objects" is a variant to allow you to fetch data for one
specific entry within the feed, if you know that entry's ID.
"json(%opts)"
Returns data roughly equivalent to the "all_objects" method, but as
a JSON string.
%opts is a hash of options, suitable for passing to the JSON
module's to_json function. The 'convert_blessed' and 'utf8' options
are enabled by default, but can be disabled by explicitly setting
them to 0, e.g.
print $feed->json( pretty=>1, canonical=>1, utf8=>0 );
"entry_json($entryid, %opts)"
"entry_json" is a variant to allow you to fetch data for one
specific entry within the feed, if you know that entry's ID.
"model(%opts)"
Returns data as an RDF::Trine::Model, suitable for serialising as
RDF or running SPARQL queries. Quads are used (rather than triples)
which allows you to trace statements to the entries from which they
came.
$opts{'atomowl'} is a boolean indicating whether or not to include
data from XML::Atom::OWL in the returned model. If enabled, this
always includes AtomOWL data for the whole feed (not just for a
specific entry), even if you use the "entry_model" method.
If RDF::RDFa::Parser 1.096 or above is installed, then
$opts{'atomowl'} will automatically pull in DataRSS data too.
"entry_model($entryid, %opts)"
"entry_model" is a variant to allow you to fetch data for one
specific entry within the feed, if you know that entry's ID.
"add_to_model($model, %opts)"
Adds data to an existing RDF::Trine::Model. Otherwise, the same as
"model".
"entry_add_to_model($entry, $model, %opts)"
Adds data to an existing RDF::Trine::Model. Otherwise, the same as
"entry_model".
BUGS
Please report any bugs to <http://rt.cpan.org/>.
SEE ALSO
HTML::Microformats, XML::Atom::OWL, XML::Atom::FromOWL,
RDF::RDFa::Parser.
<http://microformats.org/>, <http://www.perlrdf.org/>.
AUTHOR
Toby Inkster <tobyink@cpan.org>.
COPYRIGHT
Copyright 2010-2012 Toby Inkster
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
DISCLAIMER OF WARRANTIES
THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
|