#!perl
=head1 NAME

marcmaker.t -- Tests for MARC::File::MARCMaker.

=head1 TO DO

Compare decoded and encoded versions of records in camel.mrk and camel.usmarc with each other.

Determine how to link as_marcmaker method from MARC::File::MARCMaker to MARC::Field without "only once" warning.

More comprehensive tests of character encoding/decoding.

=cut
use strict;
use warnings;
use Test::More tests=>10;
BEGIN { use_ok( 'MARC::Batch' ); }
BEGIN { use_ok( 'MARC::File::USMARC' ); }
BEGIN { use_ok( 'MARC::File::MARCMaker' ); }
if (MARC::Field->can('as_marcmaker')) {
warn "MARC::Field now has an as_marcmaker() method";
}
else {
no warnings;
*MARC::Field::as_marcmaker = *MARC::File::MARCMaker::as_marcmaker;
}
###################################################
###################################################
#create MARC::Record object for manipulation
my $record = MARC::Record->new();
isa_ok( $record, 'MARC::Record', 'MARC record' );
$record->leader("00000nam  2200253 a 4500");
my $nfields = $record->add_fields(
#control number so one is present
['001', "ttt05000001"
],
#basic 008
['008', "050801s2005    ilu           000 0 eng d"
],
#basic 245
[245, "0","0",
a => "Test record from text /",
c => "Bryan Baldus ... [et al.].",
],
[500, '', '',
a => 'This is a test of ordinary features like replacement of the mnemonics for currency and dollar signs ($) and backslashes (backsolidus \ ) used for blanks in certain areas.'
],
[500, '', '',
a => 'This is a test for the conversion of curly braces; the opening curly brace ( { ) and the closing curly brace ( } ).'
],
[500, '', '',
a => "This is a test of diacritics like the uppercase Polish L in odz, the uppercase Scandinavia O in st, the uppercase D with crossbar in uro, the uppercase Icelandic thorn in ann, the uppercase digraph AE in gir, the uppercase digraph OE in uvres, the soft sign in rech, the middle dot in colleccio, the musical flat in F, the patent mark in Frizbee, the plus or minus sign in 54%, the uppercase O-hook in B, the uppercase U-hook in XA, the alif in masalah, the ayn in arab, the lowercase Polish l in Wocaw, the lowercase Scandinavian o in Kbenhavn, the lowercase d with crossbar in avola, the lowercase Icelandic thorn in ann, the lowercase digraph ae in vre, the lowercase digraph oe in cur, the lowercase hardsign in sezd, the Turkish dotless i in masal, the British pound sign in 5.95, the lowercase eth in verur, the lowercase o-hook (with pseudo question mark) in S, the lowercase u-hook in T Dc, the pseudo question mark in cui, the grave accent in tres, the acute accent in desiree, the circumflex in cote, the tilde in manana, the macron in Tokyo, the breve in russkii, the dot above in zaba, the dieresis (umlaut) in Lowenbrau, the caron (hachek) in crny, the circle above (angstrom) in arbok, the ligature first and second halves in diadia, the high comma off center in rozdelovac, the double acute in idoszaki, the candrabindu (breve with dot above) in Aliiev, the cedilla in ca va comme ca, the right hook in vieta, the dot below in teda, the double dot below in khutbah, the circle below in Samskrta, the double underscore in Ghulam, the left hook in Lech Waesa, the right cedilla (comma below) in khong, the upadhmaniya (half circle below) in humantus, double tilde, first and second halves in ngalan, high comma (centered) in geotermika.",
],
[650, '', '0',
a => 'MARC records.',
],
);
is( $nfields, 7, "All the fields added OK" );
my @fields500 = $record->field('500');
is($fields500[0]->as_marcmaker(), "=500 \\\\\$aThis is a test of ordinary features like replacement of the mnemonics for currency and dollar signs ({dollar}) and backslashes (backsolidus {bsol} ) used for blanks in certain areas.\n", join "\t", "Dollars and backslashes test ok", $fields500[0]->as_marcmaker());
is($fields500[1]->as_marcmaker(), "=500 \\\\\$aThis is a test for the conversion of curly braces; the opening curly brace ( {lcub} ) and the closing curly brace ( {rcub} ).\n", join "\t", "Curly braces test ok", $fields500[1]->as_marcmaker());
my $rec_as_maker = MARC::File::MARCMaker::encode($record);
my $record_from_maker = MARC::File::MARCMaker::decode($rec_as_maker);
isa_ok( $record_from_maker, 'MARC::Record', 'MARC record from MARCMaker data' );
my @recoded_500s = $record_from_maker->field('500');
print $recoded_500s[0]->as_string(), "\n";
print $recoded_500s[1]->as_string(), "\n";
#######################
### Diacritics test ###
#######################
is ($fields500[2]->as_marcmaker(), "=500 \\\\\$aThis is a test of diacritics like the uppercase Polish L in {Lstrok}{acute}od{acute}z, the uppercase Scandinavia O in {Ostrok}st, the uppercase D with crossbar in {Dstrok}uro, the uppercase Icelandic thorn in {THORN}ann, the uppercase digraph AE in {AElig}gir, the uppercase digraph OE in {OElig}uvres, the soft sign in rech{softsign}, the middle dot in col{middot}lecci{acute}o, the musical flat in F{flat}, the patent mark in Frizbee{reg}, the plus or minus sign in {plusmn}54%, the uppercase O-hook in B{Ohorn}, the uppercase U-hook in X{Uhorn}A, the alif in mas{mlrhring}alah, the ayn in {mllhring}arab, the lowercase Polish l in W{lstrok}oc{lstrok}aw, the lowercase Scandinavian o in K{ostrok}benhavn, the lowercase d with crossbar in {dstrok}avola, the lowercase Icelandic thorn in {thorn}ann, the lowercase digraph ae in v{aelig}re, the lowercase digraph oe in c{oelig}ur, the lowercase hardsign in s{hardsign}ezd, the Turkish dotless i in masal{inodot}, the British pound sign in {pound}5.95, the lowercase eth in ver{eth}ur, the lowercase o-hook (with pseudo question mark) in S{hooka}{ohorn}, the lowercase u-hook in T{uhorn} D{uhorn}c, the pseudo question mark in c{hooka}ui, the grave accent in tr{grave}es, the acute accent in d{acute}esir{acute}ee, the circumflex in c{circ}ote, the tilde in ma{tilde}nana, the macron in T{macr}okyo, the breve in russki{breve}i, the dot above in {dot}zaba, the dieresis (umlaut) in L{uml}owenbr{uml}au, the caron (hachek) in {caron}crny, the circle above (angstrom) in {ring}arbok, the ligature first and second halves in d{llig}i{rlig}ad{llig}i{rlig}a, the high comma off center in rozdel{rcommaa}ovac, the double acute in id{dblac}oszaki, the candrabindu (breve with dot above) in Ali{candra}iev, the cedilla in {cedil}ca va comme {cedil}ca, the right hook in viet{ogon}a, the dot below in te{dotb}da, the double dot below in {under}k{under}hu{dbldotb}tbah, the circle below in Sa{dotb}msk{ringb}rta, the double underscore in {dblunder}Ghulam, the left hook in Lech Wa{lstrok}{commab}esa, the right cedilla (comma below) in kh{rcedil}ong, the upadhmaniya (half circle below) in {breveb}humantu{caron}s, double tilde, first and second halves in {ldbltil}n{rdbltil}galan, high comma (centered) in g{commaa}eotermika.\n", "Diacritics to mnemonics ok.");
is ($recoded_500s[2]->as_string(), "This is a test of diacritics like the uppercase Polish L in odz, the uppercase Scandinavia O in st, the uppercase D with crossbar in uro, the uppercase Icelandic thorn in ann, the uppercase digraph AE in gir, the uppercase digraph OE in uvres, the soft sign in rech, the middle dot in colleccio, the musical flat in F, the patent mark in Frizbee, the plus or minus sign in 54%, the uppercase O-hook in B, the uppercase U-hook in XA, the alif in masalah, the ayn in arab, the lowercase Polish l in Wocaw, the lowercase Scandinavian o in Kbenhavn, the lowercase d with crossbar in avola, the lowercase Icelandic thorn in ann, the lowercase digraph ae in vre, the lowercase digraph oe in cur, the lowercase hardsign in sezd, the Turkish dotless i in masal, the British pound sign in 5.95, the lowercase eth in verur, the lowercase o-hook (with pseudo question mark) in S, the lowercase u-hook in T Dc, the pseudo question mark in cui, the grave accent in tres, the acute accent in desiree, the circumflex in cote, the tilde in manana, the macron in Tokyo, the breve in russkii, the dot above in zaba, the dieresis (umlaut) in Lowenbrau, the caron (hachek) in crny, the circle above (angstrom) in arbok, the ligature first and second halves in diadia, the high comma off center in rozdelovac, the double acute in idoszaki, the candrabindu (breve with dot above) in Aliiev, the cedilla in ca va comme ca, the right hook in vieta, the dot below in teda, the double dot below in khutbah, the circle below in Samskrta, the double underscore in Ghulam, the left hook in Lech Waesa, the right cedilla (comma below) in khong, the upadhmaniya (half circle below) in humantus, double tilde, first and second halves in ngalan, high comma (centered) in geotermika.", "Diacritics test ok");
#=500 \\$aThis is a test of diacritics like the uppercase Polish L in {Lstrok}{acute}od{acute}z, the uppercase Scandinavia O in {Ostrok}st, the uppercase D with crossbar in {Dstrok}uro, the uppercase Icelandic thorn in {THORN}ann, the uppercase digraph AE in {AElig}gir, the uppercase digraph OE in {OElig}uvres, the soft sign in rech{softsign}, the middle dot in col{middot}lecci{acute}o, the musical flat in F{flat}, the patent mark in Frizbee{reg}, the plus or minus sign in {plusmn}54%, the uppercase O-hook in B{Ohorn}, the uppercase U-hook in X{Uhorn}A, the alif in mas{mlrhring}alah, the ayn in {mllhring}arab, the lowercase Polish l in W{lstrok}oc{lstrok}aw, the lowercase Scandinavian o in K{ostrok}benhavn, the lowercase d with crossbar in {dstrok}avola, the lowercase Icelandic thorn in {thorn}ann, the lowercase digraph ae in v{aelig}re, the lowercase digraph oe in c{oelig}ur, the lowercase hardsign in s{hardsign}ezd, the Turkish dotless i in masal{inodot}, the British pound sign in {pound}5.95, the lowercase eth in ver{eth}ur, the lowercase o-hook (with pseudo question mark) in S{hooka}{ohorn}, the lowercase u-hook in T{uhorn} D{uhorn}c, the pseudo question mark in c{hooka}ui, the grave accent in tr{grave}es, the acute accent in d{acute}esir{acute}ee, the circumflex in c{circ}ote, the tilde in ma{tilde}nana, the macron in T{macr}okyo, the breve in russki{breve}i, the dot above in {dot}zaba, the dieresis (umlaut) in L{uml}owenbr{uml}au, the caron (hachek) in {caron}crny, the circle above (angstrom) in {ring}arbok, the ligature first and second halves in d{llig}i{rlig}ad{llig}i{rlig}a, the high comma off center in rozdel{rcommaa}ovac, the double acute in id{dblac}oszaki, the candrabindu (breve with dot above) in Ali{candra}iev, the cedilla in {cedil}ca va comme {cedil}ca, the right hook in viet{ogon}a, the dot below in te{dotb}da, the double dot below in {under}k{under}hu{dbldotb}tbah, the circle below in Sa{dotb}msk{ringb}rta, the double underscore in {dblunder}Ghulam, the left hook in Lech Wa{lstrok}{commab}esa, the right cedilla (comma below) in kh{rcedil}ong, the upadhmaniya (half circle below) in {breveb}humantu{caron}s, double tilde, first and second halves in {ldbltil}n{rdbltil}galan, high comma (centered) in g{commaa}eotermika.