File: index_hypermail.txt

From Bill Moseley:

I noticed this page on indexing with Swish-e:

  http://hypermail.org/source/docs/archive_search.html

Here's another way if using Perl.

The Swish-e (http://swish-e.org) package comes with a script for
indexing hypermail archives and includes instructions on usage.  Below
I've included part of the documentation.

That script is in use here:

  http://swish-e.org/Discussion/archive/

The search results page is not that pretty, but it's just using the
default templates.  I'm often frustrated when searching archives that I
can't limit by date or author.  So besides searching by name, email,
title, etc. on their own, you can combine them in a search like:

  hypermail name=moseley

to find posts with "hypermail" in either the title or body, restricted
to messages from moseley.
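
The same query can probably be tried outside the CGI script with
swish-e's -f (index file) and -w (search words) options.  The sketch
below assumes the index was written under Swish-e's default name,
index.swish-e, in the current directory; build_query and INDEX are
names made up for this example, not part of Swish-e.

```shell
#!/bin/sh
# Sketch: run the example query from the command line instead of swish.cgi.
# build_query and INDEX are hypothetical names used only in this example.

# Combine a free-text word with a metaname restriction,
# e.g. "hypermail name=moseley".
build_query() {
    printf '%s %s=%s\n' "$1" "$2" "$3"
}

INDEX="${INDEX:-index.swish-e}"   # assumed: Swish-e's default index file name
QUERY=$(build_query hypermail name moseley)

if command -v swish-e >/dev/null 2>&1 && [ -f "$INDEX" ]; then
    # -f selects the index file, -w gives the search words
    swish-e -f "$INDEX" -w "$QUERY"
else
    # Nothing to search here, so only show the command that would be run
    echo "would run: swish-e -f $INDEX -w '$QUERY'"
fi
```

If swish-e or the index is missing, the script only prints the command
it would have run.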


Here are the (un-proofread) instructions:


NAME
    index_hypermail.pl - Parse Hypermail archive for indexing with Swish-e

SYNOPSIS
    Using an example directory structure like this:

        hypermail/
            archive/
            search/
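
    One way to create this layout is sketched below. HM_ROOT is a
    hypothetical variable used only in this example; neither hypermail
    nor Swish-e reads it.

```shell
#!/bin/sh
# Sketch: create the example layout used in this document.
# HM_ROOT is a hypothetical variable, defaulting to ./hypermail.
HM_ROOT="${HM_ROOT:-./hypermail}"

# archive/ will hold the hypermail output, search/ the Swish-e config and index
mkdir -p "$HM_ROOT/archive" "$HM_ROOT/search"

ls "$HM_ROOT"
```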

    Create the hypermail archive:

        $ cd hypermail
        $ hypermail -i -d archive < messages.mbox

    Create a swish-e config file:

        $ cd search
        $ cat swish.conf

        # config for indexing hypermail v2.1.8 archives

        IndexDir ./index_hypermail.pl
        SwishProgParameters ../archive

        MetaNames swishtitle name email
        PropertyNames name email
        IndexContents HTML* .html
        StoreDescription HTML* <body> 100000
        UndefinedMetaTags  ignore

    Copy index_hypermail.pl to the current directory. Swish-e installs
    index_hypermail.pl in the $prefix/share/doc/swish-e/examples/prog-bin
    directory, where $prefix is typically "/usr/local" or simply "/usr" on
    some distributions.

        $ cp /usr/local/share/doc/swish-e/examples/prog-bin/index_hypermail.pl .

    Then index the documents:

        $ swish-e -c swish.conf -S prog
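
    With -S prog, swish-e runs the program named by IndexDir itself, so
    the script presumably has to be executable and swish.conf has to be
    readable. A minimal sanity check before indexing might look like
    this; check_prereqs is a hypothetical helper, not part of Swish-e.

```shell
#!/bin/sh
# Sketch: sanity-check the search directory before indexing.
# check_prereqs is a hypothetical helper, not part of Swish-e.

check_prereqs() {
    dir=$1
    # The config file passed to -c must be readable
    [ -r "$dir/swish.conf" ] || {
        echo "missing $dir/swish.conf" >&2
        return 1
    }
    # Assumption: swish-e executes the IndexDir program directly,
    # so it needs the execute bit set
    [ -x "$dir/index_hypermail.pl" ] || {
        echo "$dir/index_hypermail.pl is not executable" >&2
        return 1
    }
    return 0
}

if check_prereqs . 2>/dev/null && command -v swish-e >/dev/null 2>&1; then
    swish-e -c swish.conf -S prog
else
    echo "skipping indexing: prerequisites not met"
fi
```

    If the check fails because of permissions, "chmod +x
    index_hypermail.pl" is the usual fix.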

    Now create the search interface:

        $ cp /usr/local/lib/swish-e/swish.cgi .
        $ cat .swishcgi.conf

        $ENV{TZ} = 'UTC'; # display dates in UTC format

        return {
            title           => "Search the Foo List Archive",
            display_props   => [qw/ name email swishlastmodified /],
            sorts           => [qw/swishrank swishtitle email swishlastmodified/],
            metanames       => [qw/swishdefault swishtitle name email/],
            name_labels     => {
                swishrank           =>  'Rank',
                swishtitle          =>  'Subject Only',
                name                =>  "Poster's Name",
                email               =>  "Poster's Email",
                swishlastmodified   =>  'Message Date',
                swishdefault        =>  'Subject & Body',
            },

            highlight       => {
                package         => 'SWISH::PhraseHighlight',

                xhighlight_on    => '<font style="background:#FFFF99">',
                xhighlight_off   => '</font>',

                meta_to_prop_map => {   # this maps search metatags to display properties
                    swishdefault    => [ qw/swishtitle swishdescription/ ],
                    swishtitle      => [ qw/swishtitle/ ],
                    email           => [ qw/email/ ],
                    name            => [ qw/name/ ],
                    swishdocpath    => [ qw/swishdocpath/ ],
                },
            },
        };

    Set up the web server (OS/web server dependent):

        /var/www # ln -s /path/to/hypermail/search
        /var/www # ln -s /path/to/hypermail/archive

    and maybe tell Apache to run the script:

        $ cat .htaccess
        Deny from all
        <files swish.cgi>
            Allow from all
            SetHandler cgi-script
            Options +ExecCGI
        </files>