package WWW::Noss::Timestamp;
use 5.016;
use strict;
use warnings;
our $VERSION = '1.10';
use Time::Piece;
my %MONTHS = (
    'jan' => '01',
    'feb' => '02',
    'mar' => '03',
    'apr' => '04',
    'may' => '05',
    'jun' => '06',
    'jul' => '07',
    'aug' => '08',
    'sep' => '09',
    'oct' => '10',
    'nov' => '11',
    'dec' => '12',
);

# Regex taken from the loose parser in the DateTime::Format::Mail module.
my $mail_rx = qr{
    ^ \s*
    # Optional week day name
    (?i:
        (?:Mon|Tue|Wed|Thu|Fri|Sat|Sun|[A-Z][a-z][a-z]) ,? # Day name + comma
    )?
    \s*
    (?<dom>\d{1,2}) # Day of month
    [-\s]*
    (?i: (?<month> Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) ) # month
    [-\s]*
    (?<year>(?:\d\d)?\d\d) # year
    \s+
    (?<hour>\d?\d):(?<min>\d?\d) (?: :(?<sec>\d?\d) )? # hour:min:sec
    # Optional time zone
    (?:
        \s+ "? (?<tz>
            [+-] \d{4}              # standard
            | [A-Z]+                # obsolete (ignored)
            | GMT [+-] \d+          # empirical (converted)
            | [A-Z]+\d+             # weird empirical (ignored)
            | [a-zA-Z/]+            # linux (ignored)
            | [+-]{0,2} \d{3,5}     # corrupted standard
        ) "?
    )?
    (?: \s+ \([^\)]+\) )? # friendly tz name; empirical
    \s* \.? $
}x;

# Regex adapted from DateTime::Format::RFC3339.
my $rfc3339_rx = qr{
    ^
    # yyyy-mm-dd
    (?<year> \d{4})-(?<month> \d{2})-(?<dom> \d{2})
    T # date/time separator
    # hh:mm:ss
    (?<hour> \d{2}):(?<min> \d{2}):(?<sec> \d{2})
    # fractional seconds of any precision (ignored)
    (?: \. \d+ )?
    (?<tz>
        Z # UTC (zulu)
        | [+-]\d{2}:\d{2}
    )
    $
}x;

sub mail {

    my ($class, $time) = @_;

    $time =~ $mail_rx or return undef;

    my $dom = sprintf "%02d", $+{ dom };
    my $month = $MONTHS{ lc $+{ month } };
    # Expand two-digit years: 69-99 become 19xx, 00-68 become 20xx.
    my $year =
        length $+{ year } == 4
        ? $+{ year }
        : $+{ year } >= 69
            ? "19$+{ year }"
            : "20$+{ year }";
    my $hour = sprintf "%02d", $+{ hour } // 0;
    my $min = sprintf "%02d", $+{ min } // 0;
    my $sec = sprintf "%02d", $+{ sec } // 0;
    # Only a numeric +hhmm/-hhmm offset is honored; any other zone (or none)
    # is treated as UTC.
    my $tz =
        (defined $+{ tz } and $+{ tz } =~ /^([+-])(\d{4})$/)
        ? $1 . sprintf "%04d", $2
        : '-0000';

    # strptime dies on input it cannot parse; trap that and return undef.
    my $tp = eval {
        Time::Piece->strptime(
            join(' ', $dom, $month, $year, $hour, $min, $sec, $tz),
            '%d %m %Y %H %M %S %z',
        );
    };

    return defined $tp ? $tp->epoch : undef;

}

sub rfc3339 {

    my ($class, $time) = @_;

    $time =~ $rfc3339_rx or return undef;

    my $year = $+{ year };
    my $month = $+{ month };
    my $dom = $+{ dom };
    my $hour = $+{ hour };
    my $min = $+{ min };
    my $sec = $+{ sec };
    # Normalize the zone to the +hhmm form that %z expects: 'Z' means UTC,
    # and '+hh:mm' loses its colon.
    my $tz =
        $+{ tz } eq 'Z'
        ? '-0000'
        : $+{ tz } =~ s/://gr;

    # strptime dies on input it cannot parse; trap that and return undef.
    my $tp = eval {
        Time::Piece->strptime(
            join(' ', $year, $month, $dom, $hour, $min, $sec, $tz),
            '%Y %m %d %H %M %S %z'
        );
    };

    return defined $tp ? $tp->epoch : undef;

}

1;

=head1 NAME

WWW::Noss::Timestamp - Parse timestamps

=head1 USAGE

    use WWW::Noss::Timestamp;

    my $epoch = WWW::Noss::Timestamp->rfc3339(
        '2025-07-12T00:23:00Z'
    );

=head1 DESCRIPTION

B<WWW::Noss::Timestamp> is a module that provides methods for parsing various
timestamp formats used by RSS and Atom feeds. This is a private module; please
consult the L<noss(1)> manual for user documentation.

=head1 METHODS

Each method is invoked as a class method and returns the timestamp as seconds
since the Unix epoch, or C<undef> on failure.

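
Because failure is reported as C<undef>, callers should check that the result
is defined before using it. The following is a minimal sketch of that pattern,
not code from this module: C<describe_epoch> is a hypothetical helper, and the
literal C<1752279780> (the epoch for 2025-07-12T00:23:00 UTC) stands in for the
value of a successful parse.

    use strict;
    use warnings;
    use Time::Piece;

    # Hypothetical helper: render a parse result for display.
    sub describe_epoch {
        my ($epoch) = @_;
        # undef signals that the timestamp could not be parsed.
        return 'unparseable' unless defined $epoch;
        # Time::Piece overrides gmtime to return an object.
        return gmtime($epoch)->datetime . 'Z';
    }

    print describe_epoch(1752279780), "\n"; # 2025-07-12T00:23:00Z
    print describe_epoch(undef), "\n";      # unparseable
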
=over 4

=item $epoch = WWW::Noss::Timestamp->mail($str)

Parse RFC2822/822 timestamps, used by RSS feeds. This is a lenient parser that
is capable of parsing some non-standard timestamps.

=item $epoch = WWW::Noss::Timestamp->rfc3339($str)

Parse RFC3339 timestamps, used by Atom feeds.

=back

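
One instance of the C<mail> parser's leniency is its handling of two-digit
years. As implemented at the time of writing, 69 through 99 are read as 1969
through 1999 and 00 through 68 as 2000 through 2068, while four-digit years
pass through unchanged. A standalone sketch of that rule follows; C<expand_year>
is a hypothetical name used only for illustration and is not part of this
module's interface.

    use strict;
    use warnings;

    # Hypothetical helper mirroring the two-digit year rule.
    sub expand_year {
        my ($year) = @_;
        # Four-digit years need no expansion.
        return $year if length($year) == 4;
        # Pivot at 69: 69-99 map to 19xx, 00-68 map to 20xx.
        return $year >= 69 ? "19$year" : "20$year";
    }

    print expand_year('99'), "\n";   # 1999
    print expand_year('25'), "\n";   # 2025
    print expand_year('2025'), "\n"; # 2025
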
=head1 AUTHOR

Written by Samuel Young, E<lt>samyoung12788@gmail.comE<gt>.

This project's source can be found on its
L<Codeberg page|https://codeberg.org/1-1sam/noss.git>. Comments and pull
requests are welcome!

=head1 COPYRIGHT

Copyright (C) 2025 Samuel Young

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

=head1 SEE ALSO

L<noss>

=cut

# vim: expandtab shiftwidth=4