File: README

package info (click to toggle)
libhtml-microformats-perl 0.105-6
  • links: PTS, VCS
  • area: main
  • in suites: bookworm, sid, trixie
  • size: 1,340 kB
  • sloc: perl: 14,121; makefile: 10; sh: 1
file content (272 lines) | stat: -rw-r--r-- 9,109 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
NAME
    HTML::Microformats - parse microformats in HTML

SYNOPSIS
     use HTML::Microformats;
 
     my $doc = HTML::Microformats
                 ->new_document($html, $uri)
                 ->assume_profile(qw(hCard hCalendar));
     print $doc->json(pretty => 1);
 
     use RDF::TrineShortcuts qw(rdf_query);
     my $results = rdf_query($sparql, $doc->model);

DESCRIPTION
    The HTML::Microformats module is a wrapper for parser and handler
    modules of various individual microformats (each of those modules has a
    name like HTML::Microformats::Format::Foo).

    The general pattern of usage is to create an HTML::Microformats object
    (which corresponds to an HTML document) using the "new_document" method;
    then ask for the data, as a Perl hashref, a JSON string, or an
    RDF::Trine model.

  Constructor
    "$doc = HTML::Microformats->new_document($html, $uri, %opts)"
        Constructs a document object.

        $html is the HTML or XHTML source (string) or an
        XML::LibXML::Document.

        $uri is the document URI, important for resolving relative URL
        references.

        %opts are additional parameters; currently only one option is
        defined: $opts{'type'} is set to 'text/html' or
        'application/xhtml+xml', to control how $html is parsed.

  Profile Management
    HTML::Microformats uses HTML profiles (i.e. the profile attribute on the
    HTML <head> element) to detect which Microformats are used on a page.
    Any microformats which do not have a profile URI declared will not be
    parsed.

    Because many pages fail to properly declare which profiles they use,
    there are various profile management methods to tell HTML::Microformats
    to assume the presence of particular profile URIs, even if they're
    actually missing.

    "$doc->profiles"
        This method returns a list of profile URIs declared by the document.

    "$doc->has_profile(@profiles)"
        This method returns true if and only if one or more of the profile
        URIs in @profiles is declared by the document.

    "$doc->add_profile(@profiles)"
        Using "add_profile" you can add one or more profile URIs, and they
        are treated as if they were found in the document.

        For example:

         $doc->add_profile('http://microformats.org/profile/rel-tag')

        This is useful for adding profile URIs declared outside the document
        itself (e.g. in HTTP headers).

        Returns a reference to the document.

    "$doc->assume_profile(@microformats)"
        For example:

         $doc->assume_profile(qw(hCard adr geo))

        This method acts similarly to "add_profile" but allows you to use
        names of microformats rather than URIs.

        Microformat names are case sensitive, and must match
        HTML::Microformats::Format::Foo module names.

        Returns a reference to the document.

    "$doc->assume_all_profiles"
        This method is equivalent to calling "assume_profile" for all known
        microformats.

        Returns a reference to the document.

  Parsing Microformats
    Generally speaking, you can skip this. The "data", "json" and "model"
    methods will automatically do this for you.

    "$doc->parse_microformats"
        Scans through the document, finding microformat objects.

        On subsequent calls, does nothing (as everything is already parsed).

        Returns a reference to the document.

    "$doc->clear_microformats"
        Forgets information gleaned by "parse_microformats" and thus allows
        "parse_microformats" to be run again. This is useful if you've
        modified or added some profiles between runs of "parse_microformats".

        Returns a reference to the document.

  Retrieving Data
    These methods allow you to retrieve the document's data, and do things
    with it.

    "$doc->objects($format);"
        $format is, for example, 'hCard', 'adr' or 'RelTag'.

        Returns a list of objects of that type. (If called in scalar
        context, returns an arrayref.)

        Each object is, for example, an HTML::Microformat::hCard object, or
        an HTML::Microformat::RelTag object, etc. See the relevant
        documentation for details.

    "$doc->all_objects"
        Returns a hashref of data. Each hashref key is the name of a
        microformat (e.g. 'hCard', 'RelTag', etc), and the values are
        arrayrefs of objects.

        Each object is, for example, an HTML::Microformat::hCard object, or
        an HTML::Microformat::RelTag object, etc. See the relevant
        documentation for details.

    "$doc->json(%opts)"
        Returns data roughly equivalent to the "all_objects" method, but as
        a JSON string.

        %opts is a hash of options, suitable for passing to the JSON
        module's to_json function. The 'convert_blessed' and 'utf8' options
        are enabled by default, but can be disabled by explicitly setting
        them to 0, e.g.

          print $doc->json( pretty=>1, canonical=>1, utf8=>0 );

    "$doc->model"
        Returns data as an RDF::Trine::Model, suitable for serialising as
        RDF or running SPARQL queries.

    "$object->serialise_model(as => $format)"
        As "model" but returns a string.

    "$doc->add_to_model($model)"
        Adds data to an existing RDF::Trine::Model.

        Returns a reference to the document.

  Utility Functions
    "HTML::Microformats->modules"
        Returns a list of Perl modules, each of which implements a specific
        microformat.

    "HTML::Microformats->formats"
        As per "modules", but strips 'HTML::Microformats::Format::' off the
        module name, and sorts alphabetically.

WHY ANOTHER MICROFORMATS MODULE?
    There already exist two microformats packages on CPAN (see
    Text::Microformat and Data::Microformat), so why create another?

    Firstly, HTML::Microformats isn't being created from scratch. It's
    actually a fork/clean-up of a non-CPAN application (Swignition), and in
    that sense predates Text::Microformat (though not Data::Microformat).

    It has a number of other features that distinguish it from the existing
    packages:

    *   It supports more formats.

        HTML::Microformats supports hCard, hCalendar, rel-tag, geo, adr,
        rel-enclosure, rel-license, hReview, hResume, hRecipe, xFolk, XFN,
        hAtom, hNews and more.

    *   It supports more patterns.

        HTML::Microformats supports the include pattern, abbr pattern, table
        cell header pattern, value excerpting and other intricacies of
        microformat parsing better than the other modules on CPAN.

    *   It offers RDF support.

        One of the key features of HTML::Microformats is that it makes data
        available as RDF::Trine models. This allows your application to
        benefit from a rich, feature-laden Semantic Web toolkit. Data
        gleaned from microformats can be stored in a triple store; output in
        RDF/XML or Turtle; queried using the SPARQL or RDQL query languages;
        and more.

        If you're not comfortable using RDF, HTML::Microformats also makes
        all its data available as native Perl objects.

BUGS
    Please report any bugs to <http://rt.cpan.org/>.

SEE ALSO
    HTML::Microformats::Documentation::Notes.

    Individual format modules:

    *   HTML::Microformats::Format::adr

    *   HTML::Microformats::Format::figure

    *   HTML::Microformats::Format::geo

    *   HTML::Microformats::Format::hAtom

    *   HTML::Microformats::Format::hAudio

    *   HTML::Microformats::Format::hCalendar

    *   HTML::Microformats::Format::hCard

    *   HTML::Microformats::Format::hListing

    *   HTML::Microformats::Format::hMeasure

    *   HTML::Microformats::Format::hNews

    *   HTML::Microformats::Format::hProduct

    *   HTML::Microformats::Format::hRecipe

    *   HTML::Microformats::Format::hResume

    *   HTML::Microformats::Format::hReview

    *   HTML::Microformats::Format::hReviewAggregate

    *   HTML::Microformats::Format::OpenURL_COinS

    *   HTML::Microformats::Format::RelEnclosure

    *   HTML::Microformats::Format::RelLicense

    *   HTML::Microformats::Format::RelTag

    *   HTML::Microformats::Format::species

    *   HTML::Microformats::Format::VoteLinks

    *   HTML::Microformats::Format::XFN

    *   HTML::Microformats::Format::XMDP

    *   HTML::Microformats::Format::XOXO

    Similar modules: RDF::RDFa::Parser, HTML::HTML5::Microdata::Parser,
    XML::Atom::Microformats, Text::Microformat, Data::Microformats.

    Related web sites: <http://microformats.org/>,
    <http://www.perlrdf.org/>.

AUTHOR
    Toby Inkster <tobyink@cpan.org>.

COPYRIGHT AND LICENCE
    Copyright 2008-2012 Toby Inkster

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

DISCLAIMER OF WARRANTIES
    THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
    WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
    MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.