File: chronicle-entry-filter

package info
chronicle 4.4-1
  • links: PTS
  • area: main
  • in suites: squeeze
  • size: 656 kB
  • ctags: 108
  • sloc: perl: 2,323; makefile: 132; sh: 9
file content (621 lines) | stat: -rwxr-xr-x 10,957 bytes
#!/usr/bin/perl -w

=head1 NAME

chronicle-entry-filter - Convert blog files to HTML, if required.

=cut

=head1 SYNOPSIS

  Help Options

    --help        Show a brief help overview.
    --version     Show the version of this script.

  Options

    --format      The global format of all entries.
    --filename    The name of the single file to process.

  Filters

    --pre-filter  A filter to run before conversion to HTML.
    --post-filter A filter to run after HTML conversion.

=cut

=head1 ABOUT

This script is designed to receive a filename and a global formatting type
upon the command line.  The formatting type specifies how the blog entry
file will be processed:

  1.  If the format is "textile" the file will be converted from textile
     to HTML.

  2.  If the format is "markdown" the file will be converted from markdown
     to HTML.  The related format "multimarkdown" is also recognised.

  3.  If the format is "html" no changes will be made.

Once the conversion has been applied the resulting HTML is scanned for
<code> tags, which are expanded via the B<Text::VimColor> module if it is
installed.  This allows the pretty-printing of source code.

To enable the syntax highlighting of code fragments you should format your
code samples as follows:

=for example begin

   Subject: Some highlighted code.
   Date: 25th December 2009
   Tags: chronicle, perl, blah

   <p>Here is some code which will look pretty ..</p>
   <code lang="perl">
   #!/usr/bin/perl -w
   ...
   ..
   </code>

=for example end

Notice the use of lang="perl", which provides a hint as to the type of
syntax highlighting to apply.
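
As a rough illustration of how such a tag might be picked up, the sketch
below applies a substitution of the same shape as the one used by this
script.  The routine names C<fake_highlight> and C<expand_code_tags> are
hypothetical, and C<fake_highlight> merely stands in for Text::VimColor so
that the example runs without any optional modules:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real highlighter: it only wraps the
# code in a <span> which records the requested language.
sub fake_highlight
{
    my ( $code, $lang ) = @_;
    return "<span class=\"$lang\">$code</span>";
}

# Replace each <code lang="..."> block with its "highlighted" form.
# Plain <code> tags, without a lang attribute, are left untouched.
sub expand_code_tags
{
    my ($html) = @_;
    $html =~ s{<code lang=['"]([^'"]+)['"]>(.*?)(</code>)}
              {"<code>" . fake_highlight( $2, $1 ) . $3}msegi;
    return $html;
}

print expand_code_tags(qq{<p>Hi</p><code lang="perl">print 1;</code>\n});
```

Only tags which carry a lang attribute are rewritten, so existing entries
that use a bare <code> tag should be unaffected.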

Additionally you may make use of the pre-filter and post-filter pseudo-headers
which allow you to transform the entry in further creative fashions.

For example you might wish the blog to be upper-case only for some reason,
and this could be achieved via:

=for example begin

  Subject: I DONT LIKE LOWER CASE
  Tags: meta, random, silly
  Date: 25th December 2009
  Pre-Filter: perl -p -e "s/__USER__/`whoami`/g"
  Post-filter: tr a-z A-Z

  <p>This post, written by __USER__ will have no lower-case values.</p>
  <p>Notice how my username was inserted too?</p>

=for example end

A filter may be an arbitrarily complex command, including several commands
chained together in a pipeline.  Each filter should read the entry on STDIN
and write the updated content to STDOUT.
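
Any program which follows that convention can be used.  As a minimal
sketch, a standalone filter which upper-cases the text of an entry while
leaving the HTML tags alone might look like the following.  The routine
name C<shout> is purely illustrative, and HTML entities are not treated
specially:

```perl
use strict;
use warnings;

# Upper-case the text between tags, leaving the tags themselves alone.
sub shout
{
    my ($html) = @_;
    $html =~ s{(<[^>]*>)|([^<]+)}{ defined $1 ? $1 : uc $2 }ge;
    return $html;
}

# Act as a filter, unless STDIN is an interactive terminal:  read the
# whole entry from STDIN and write the result to STDOUT.
if ( !-t STDIN )
{
    my $entry = do { local $/; <STDIN> };
    print shout($entry) if defined $entry;
}
```

Saved as an executable file somewhere on your PATH, such a script could
then be named in a Post-Filter pseudo-header in place of the tr command.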

(If you wish to apply a global filter simply pass that as an argument to
chronicle, or in your chroniclerc file.)

=cut

=head1 AUTHOR

 Steve
 --
 http://www.steve.org.uk/

=cut

=head1 LICENSE

Copyright (c) 2009-2010 by Steve Kemp.  All rights reserved.

This module is free software;
you can redistribute it and/or modify it under
the same terms as Perl itself.
The LICENSE file contains the full text of the license.

=cut


use strict;
use warnings;


use Getopt::Long;
use IPC::Open2;
use Pod::Usage;
use Symbol;


#
#  Release number
#
#  NOTE:  Set by 'make release'.
#
my $RELEASE = '4.4';




#
#  Dispatch table of input type converters.
#
#  Each entry will have (up to) the following two keys:
#
#    module  => Any optional modules required - multiple comma-separated
#               values are permissible.
#
#    routine => The routine to convert the input to the HTML output.
#
my %dispatch = (
    "html" => { routine => \&do_html, },

    "markdown" => { module  => "Text::Markdown",
                    routine => \&do_markdown,
                  },

    "multimarkdown" => { module  => "Text::MultiMarkdown",
                         routine => \&do_multimarkdown,
                       },

    "textile" => { module  => "Text::Textile",
                   routine => \&do_textile,
                 } );




#
#  Parse the command line options.
#
my %CONFIG = parseCommandLineArguments();


#
#  If we don't have a filename then it is game over.
#
if ( !$CONFIG{ 'filename' } )
{
    print "Mandatory filename missing: Help?!\n";
    exit 1;
}



#
#  Read the input from the file
#
my ( $text, %headers ) = readInputFile( $CONFIG{ 'filename' } );


#
#  Pre-filter?
#
my $pre = $CONFIG{ 'pre-filter' } || $headers{ 'pre-filter' } || undef;
if ( defined($pre) )
{
    $text = runFilter( $pre, $text );
}



#
# At this point we need to work out how to format the entry.
#
# We might have (in order of precedence):
#
#  a. A per-entry format
#  b. A global format.
#  c. The default format (html)
#
my $format = $headers{ 'format' } || $CONFIG{ 'format' } || "html";


#
#  Lookup details to use in the dispatch table.
#
my $obj = $dispatch{ lc $format };
if ( !$obj )
{
    print "The input method $format is unknown.\n";
    exit 1;
}

#
#  Do we have to load an optional module?
#
if ( $obj->{ 'module' } )
{
    loadOptionalModules( $obj->{ 'module' }, $format );
}

#
#  Now convert
#
my $html = $obj->{ 'routine' }->($text);


#
#  Do code formatting
#
$html =~ s{<code lang=['"]([^'"]+)['"]>(.*?)(</code>)}
          {"<code>" . highlightCode($2, $1) . $3}msegi;


#
#  Post-filter?
#
my $post = $CONFIG{ 'post-filter' } || $headers{ 'post-filter' } || undef;
if ( defined($post) )
{
    $html = runFilter( $post, $html );
}



#
#  Finally output the result such that chronicle can include it
# in the blog.
#
#  Ensure we're UTF-8 clean.
#
binmode STDOUT, ":utf8";
print $html;

#
#  All over :)
#
exit 0;




=begin doc

Parse the command line options we expect to receive.

The help, manual, and version flags are handled here directly.

=end doc

=cut

sub parseCommandLineArguments
{
    my $HELP    = 0;
    my $MANUAL  = 0;
    my $VERSION = 0;

    my %options;

    if (
        !GetOptions(

            # help options
            "help",    \$HELP,
            "manual",  \$MANUAL,
            "verbose", \$options{ 'verbose' },
            "version", \$VERSION,

            # filename
            "filename=s", \$options{ 'filename' },

            # global format
            "format=s", \$options{ 'format' },

            # filters
            "pre-filter=s",  \$options{ 'pre-filter' },
            "post-filter=s", \$options{ 'post-filter' },
        ) )
    {
        exit 1;
    }


    pod2usage(1) if $HELP;
    pod2usage( -verbose => 2 ) if $MANUAL;

    if ($VERSION)
    {
        print("chronicle-entry-filter release $RELEASE\n");
        exit;
    }

    return (%options);
}



=begin doc

Read the specified blog file, and return the body of the file along with
a hash of the headers found above it.

Header names are lower-cased, and only the first non-empty value seen for
each header is kept.

=end doc
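
The header handling can be sketched in isolation.  The following is an
illustration only, working upon a string rather than a file, and the
routine name parse_entry is hypothetical:

```perl
use strict;
use warnings;

# Split an entry, held in a string, into its body and a hash of
# lower-cased headers.  The first blank line ends the header block.
sub parse_entry
{
    my ($content) = @_;

    my %headers;
    my $body      = "";
    my $in_header = 1;

    foreach my $line ( split /^/m, $content )
    {
        if ( !$in_header )
        {
            $body .= $line;
        }
        elsif ( $line =~ /^[\r\n]*$/ )
        {
            $in_header = 0;
        }
        elsif ( $line =~ /^([^:]+):(.*)/ )
        {
            my ( $key, $val ) = ( lc $1, $2 );
            $val =~ s/^\s+|\s+$//g;

            # The first non-empty value for a header wins.
            $headers{ $key } = $val
              if ( length($val) && !exists $headers{ $key } );
        }
    }
    return ( $body, %headers );
}

my ( $body, %headers ) = parse_entry(<<"EOF");
Subject: Hello
Format: markdown

Some *text*.
EOF

print "format=", $headers{ 'format' }, "\n";
print $body;
```

Because the keys are lower-cased, a header written as Format:, format: or
FORMAT: would be found under the same key.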

=cut

sub readInputFile
{
    my ($filename) = (@_);

    #
    #  Open the specified file.
    #
    open my $handle, "<:utf8", $filename or
      die "Failed to open $filename: $!\n";


    #
    #  Parse the header and body into these values
    #
    my %headers;
    my $body = "";


    #
    #  Read the file.
    #
    my $header = 1;
    while ( my $line = <$handle> )
    {
        if ($header)
        {

            #
            #  If the header is "foo:bar" then record that
            #
            if ( $line =~ /^([^:]+):(.*)/ )
            {
                my $key = $1;
                my $val = $2;

                $key = lc($key);
                $val =~ s/^\s+|\s+$//g;

                $headers{ $key } = $val
                  if ( length($val) && !$headers{ $key } );
            }

            #
            #  End of the header?
            #
            # NOTE: Slight hack for working under Cygwin on
            #       Microsoft Windows where \r and \n roam wild.
            #
            $header = 0 if ( $line =~ /^[\r\n]*$/ );
        }
        else
        {
            $body .= $line;
        }

    }
    close($handle);

    return ( $body, %headers );
}




=begin doc

Run the text we've got through the specified command.

The command will receive the text on STDIN and should return the
(potentially modified) text to STDOUT.
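
A minimal, self-contained sketch of the same idea follows.  The routine
name run_filter is hypothetical, and the example assumes that the tr
command is available.  As all of the input is written before any output
is read, a filter which emits a great deal of output before it has
finished reading could in principle stall, so treat this as an
illustration rather than a robust implementation:

```perl
use strict;
use warnings;
use IPC::Open2;

# Feed $text to the shell command $cmd and collect whatever it prints.
sub run_filter
{
    my ( $cmd, $text ) = @_;

    # open2 creates the two handles for us when given undefined scalars.
    my $pid = open2( my $reader, my $writer, $cmd );

    # Send the input, then close the handle so the command sees EOF.
    print {$writer} $text;
    close($writer);

    # Slurp everything the command wrote to its STDOUT.
    my $result = do { local $/; <$reader> };
    close($reader);

    # Reap the child so that it does not linger as a zombie.
    waitpid( $pid, 0 );
    return defined $result ? $result : "";
}

print run_filter( "tr a-z A-Z", "hello, world\n" );
```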

=end doc

=cut

sub runFilter
{
    my ( $cmd, $text ) = (@_);

    my $WTR = gensym();
    my $RDR = gensym();

    $CONFIG{ 'verbose' } && print STDERR "Running filter: $cmd\n";

    my $pid = open2( $RDR, $WTR, $cmd );

    print $WTR $text;
    close($WTR);

    my $result = "";
    while (<$RDR>)
    {
        $result .= $_;
    }

    waitpid $pid, 0;
    return $result;
}



=begin doc

Load an optional module.
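
The general technique of testing for a module at run time can be sketched
as below.  This is an illustration only:  the routine name have_module is
hypothetical, and it uses require rather than the use statement found in
the routine itself, so nothing is imported into the calling package.

```perl
use strict;
use warnings;

# Try to load a module by name at run time; return 1 on success, else 0.
sub have_module
{
    my ($module) = @_;

    # Only accept plain package names, since the name is passed to eval.
    return 0 unless ( $module =~ /^\w+(?:::\w+)*$/ );

    ## no critic (Eval)
    my $ok = eval "require $module; 1";
    ## use critic

    return $ok ? 1 : 0;
}

print "Getopt::Long: ", have_module("Getopt::Long"), "\n";
print "No::Such::Module::Here: ", have_module("No::Such::Module::Here"), "\n";
```

Returning a flag, rather than exiting, leaves the caller free to decide
whether a missing module is fatal.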

=end doc

=cut

sub loadOptionalModules
{
    my ( $module, $format ) = (@_);

    foreach my $mod ( split( /,/, $module ) )
    {

        #
        #  Strip space, and empty modules
        #
        $mod =~ s/^\s+|\s+$//g;
        next if ( !length($mod) );

        #
        #  Make sure we have the module installed.  Use eval to
        # avoid making this mandatory.
        #
        my $test = "use $mod;";

        #
        #  Test loading the module.
        #
        ## no critic (Eval)
        eval($test);
        ## use critic

        if ($@)
        {
            my $package = "lib" . lc($mod) . "-perl";
            $package =~ s/::/-/g;

            print <<EOF;

You have chosen to format your input text via the $format format, but the
Perl module $mod is not installed.

Aborting.

Upon a Debian GNU/Linux system you can probably correct this via:

  apt-get install $package
EOF
            exit 1;
        }
    }

}



=begin doc

 Convert from HTML to HTML.
 (i.e. NOP)

=end doc

=cut

sub do_html
{
    my ($text) = (@_);

    return ($text);
}



=begin doc

  Convert from markdown to HTML.

=end doc

=cut

sub do_markdown
{
    my ($text) = (@_);

    return ( Text::Markdown::markdown($text) );
}


=begin doc

  Convert from multimarkdown to HTML.

=end doc

=cut

sub do_multimarkdown
{
    my ($text) = (@_);

    return ( Text::MultiMarkdown::markdown($text) );
}



=begin doc

  Convert from textile to HTML.

=end doc

=cut

sub do_textile
{
    my ($text) = (@_);

    #
    #  Convert, via the textile helper.
    #
    my $textile = Text::Textile->new();

    if ( defined( $CONFIG{ 'charset' } ) )
    {
        $CONFIG{ 'verbose' } &&
          print STDERR "Formatting via textile with charset $CONFIG{'charset'}\n";

        $textile->charset( $CONFIG{ 'charset' } );
    }

    #
    #  Now return HTML
    #
    my $html = $textile->process($text);
    return ($html);
}



=begin doc

Attempt to highlight the given text with the given language bindings.

Note that this relies upon Text::VimColor...

=end doc

=cut

sub highlightCode
{
    my ( $text, $lang ) = (@_);


    #
    #  Make sure we have the Text::VimColor module installed.  Use eval to
    # avoid making this mandatory.
    #
    my $test = "use Text::VimColor;";

    #
    #  Test loading the module.
    #
    ## no critic (Eval)
    eval($test);
    ## use critic

    #
    #  If there was an error then we'll ignore the highlighting.
    #
    if ($@)
    {
        return $text;
    }


    my $syntax = Text::VimColor->new( string     => $text,
                                      filetype   => $lang,
                                      stylesheet => 1,
                                    );

    return ( $syntax->html );
}