DEJASEARCH Version 1.8.4

AUTHORS     : Chew Wei Yih, Victor <vchew@post1.com>
              Steffen Ullrich <coyote.frank@gmx.net>
              Frank de Lange <frank@unternet.org>
              J.R. Tietsort <jrtietsort@micron.com>
              Dan Shiovitz <dans@drizzle.com>

CONTRIBUTORS: Andre Majorel <amajorel@teaser.fr>

URL         : http://homemade.hypermart.net/dejasearch/


1. OVERVIEW

DejaSearch is a frontend to Deja.com (http://www.deja.com/), the leading
Usenet archive and search engine. Deja.com is a great resource for uncovering
even the most obscure information in the sea of data that is Usenet. I
frequently use it to find answers to technical problems, or to learn how
people rate particular products before I make a purchase.

DejaSearch submits your search to Deja.com, then retrieves all the search
results and consolidates them into a single HTML file, sorted by newsgroup,
then subject, then date (most recent first). Related messages therefore end
up close to one another, with the more recent ones nearer the top. You can
also print the entire file and read the messages at your leisure, instead of
going through them one by one on the screen.
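
As a rough illustration of that ordering (this is not DejaSearch's actual
code), the sketch below sorts a few made-up records with the standard sort
utility. The tab-separated layout, the YYYY-MM-DD dates and the
sort_messages name are all assumptions made for this example.

```shell
# Hypothetical records: newsgroup<TAB>subject<TAB>date (YYYY-MM-DD).
# This only illustrates the ordering; DejaSearch itself is a Perl script
# and may organise its data quite differently.
sort_messages() {
  # Key 1: newsgroup, key 2: subject, key 3: date in reverse (newest first).
  LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2 -k3,3r
}

sorted=$(printf '%s\t%s\t%s\n' \
  comp.os.linux.networking 'ppp setup' 1999-03-01 \
  comp.os.linux.networking 'ip masquerading' 1999-02-10 \
  comp.os.linux.networking 'ip masquerading' 1999-04-22 \
  comp.os.linux.misc 'ppp setup' 1999-01-05 | sort_messages)

printf '%s\n' "$sorted"
```

Messages from the same newsgroup with the same subject end up next to each
other, and within such a group the newest message comes first.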


2. DEPENDENCIES

DejaSearch should be able to run on all Unix systems with a Perl interpreter
installed. At present, it has only been tested on Linux.

It has been reported that DejaSearch also runs with:

- ActivePerl (for Windoze)
- Apache (for Windoze)


3. USAGE 

The usage pattern is of the form:

  dejasearch [-proxy p] [-max m] [-output o] <search keywords>

The parameters for DejaSearch are explained in more detail below:

  -proxy    <URL>      Proxy server in http://hostname:port format
                       (Alternatively, you can make use of the
                       "http_proxy" environment variable)
  -max      <num>      Maximum number of messages to retrieve
  -output   <filename> Output HTML file (default: summary.html)
  -type     <type>     Valid values are recent, old and all (default: all)
  -format   <type>     Deja.com search results format.
                       Valid values are classic, new or mbox (default: new)
  -fromdate <date>     Date to limit search from (eg. Apr+1+1997)
  -todate   <date>     Date to limit search to (eg. Apr+8+1997)
  -[no]status          Display download status. (default: yes)
  -[no]verbose         Display search status. (default: yes)
                       Note that -noverbose implies -nostatus.
  -sleep    <secs>     Sleep given number of secs between each retrieval.
                       (default: 0)
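
As a small sketch of the "http_proxy" alternative mentioned above, the
following sets the variable in a Bourne-type shell before running a search.
The host name proxy.example.com and port 8080 are placeholders, and the
search only runs if dejasearch is found on your PATH.

```shell
# proxy.example.com:8080 is a placeholder; substitute your own proxy.
http_proxy=http://proxy.example.com:8080
export http_proxy

# Only attempt the search if dejasearch is actually installed.
if command -v dejasearch >/dev/null 2>&1; then
  # Intended to behave like: dejasearch -proxy "$http_proxy" -max 20 linux ppp
  dejasearch -max 20 linux ppp
fi
```

In csh or tcsh the equivalent of the first two lines would be
"setenv http_proxy http://proxy.example.com:8080".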

Some examples:

  dejasearch linux ip masquerading ppp
    This will search for all messages containing "linux", "ip", "masquerading"
    and "ppp" and output all messages to "summary.html".

  dejasearch "linux [ip masquerading] ppp"
    This will search for all messages containing "linux", "ip masquerading"
    and "ppp" and output all messages to "summary.html". Since I don't know
    of any method to pass double quotes to the shell (I use tcsh), I chose
    '[' and ']' as an alias for specifying a phrase.

  dejasearch "linux & (ip ^ masquerading) & ppp"
    This will search for all messages containing "linux", "ppp", and "ip"
    close to "masquerading", and output all messages to "summary.html".

  dejasearch -max 50 -output results.html linux ip masquerading ppp
    This will search for up to 50 messages containing "linux", "ip", "masquerading"
    and "ppp" and output all messages to "results.html".
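
On the subject of quoting: most shells, tcsh included, will pass literal
double quotes to a program if the whole argument is wrapped in single
quotes. The sketch below shows this with a hypothetical show_args helper
that simply prints what it receives. Whether DejaSearch forwards such quotes
to Deja.com unchanged is not covered by this README, so the '[' and ']'
alias remains the documented way to specify a phrase.

```shell
# show_args is a hypothetical helper that prints each argument it receives,
# so you can see exactly what a program such as dejasearch would be given.
show_args() {
  for arg in "$@"; do
    printf '<%s>\n' "$arg"
  done
}

# Single quotes keep the inner double quotes intact (sh, bash, csh, tcsh).
phrase_single=$(show_args 'linux "ip masquerading" ppp')

# In Bourne-type shells a backslash also works inside double quotes.
phrase_escaped=$(show_args "linux \"ip masquerading\" ppp")

# Both lines print: <linux "ip masquerading" ppp>
printf '%s\n%s\n' "$phrase_single" "$phrase_escaped"
```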


4. DEJA.COM SEARCH LANGUAGE QUICK REFERENCE

   Keywords can be separated by the following connectors: 

      &  - AND           e.g. beans & rice
      |  - OR                camel | llama
      &! - AND NOT           clam &! chowder
      ^  - NEAR              lucas ^ spielberg

   Keywords can be combined with the following symbols: 

      "..." - Quote Marks    "the far side"
        *   - Wildcard       psych*
      (...) - Parentheses    scully & (xfiles | x-files)
      {...} - Braces         {monkey monkeying}

   Keywords can be preceded by the following context operators: 

      ~a  - Author           ~a demos@deja.com
      ~s  - Subject          ~s chess
      ~g  - Newsgroup        ~g alt.love
      ~dc - Creation date    ~dc 1996/12/31


5. USING DEJASEARCH AS A CGI SCRIPT

First copy dejasearch into your cgi-bin directory with the proper access
permissions. Then access dejasearch with your browser! It's as simple as
that! 

  eg. http://my.site.com/cgi-bin/dejasearch
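
A minimal sketch of the copy step, assuming that "proper access permissions"
means mode 755 (readable and executable by the web server account, a common
choice for CGI scripts). The install_cgi name is made up for this example,
and your web server may need different settings.

```shell
# install_cgi is a hypothetical helper: copy a script into a cgi-bin
# directory and make it readable and executable by everyone.
install_cgi() {
  script=$1
  cgi_dir=$2
  cp "$script" "$cgi_dir/" && chmod 755 "$cgi_dir/$(basename "$script")"
}
```

For example, with the cgi-bin directory used in section 6:

    install_cgi dejasearch /usr/local/httpd/cgi-bin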

Some additional information below (mostly copied from Frank's email to me):

- On non-frames browsers it presents a frameless page with a simple,
  single-line search form.

- On frames-capable browsers (actually, all browsers except those in the
  non_frames_browsers array) it presents either a two- or three-paned
  interface:

  - If the user chooses 'Headers only', it uses the deja.com primary results
    pages to extract the subject, author, forum and date. These are then
    presented as a list in the second frame. When the user clicks on one
    of those links, the result is shown in the third frame. This way,
    you never have to switch browser windows or use the back button to see
    more results.

  - If the user chooses 'Show messages', the second frame contains the
    normal dejasearch output.

The script also has some new options:

 - new or classic interface (option -format new|classic)

   Deja.com actually still offers the original, usable (in contrast to the
   current bloated portal-monster) interface. You can reach this original
   interface using the =dnc (dejanewsclassic?) modifier in the URL. Have a
   look at the %deja hash in the source for the URLs and search patterns to
   use.

   The classic interface is much cleaner and faster, but it does not offer
   the colored responses of the new interface. On a modem connection the
   difference is quite noticeable (about twice as fast), which is no
   surprise given how bandwidth-hungry the new interface is.

   By the way, both the classic and new interfaces still offer the 'Text
   only' option using the fmt=text modifier. This gives you the speediest
   deja.com ever, but you lose the pre-formatted URLs etc. I think I will
   add this as a third interface option though. I might even include my own
   URL formatting et al.

       http://www.deja.com/=dnc/[ST_rn=qs]/getdoc.xp?AN=474042698&fmt=text
       (classic interface)

       http://x31.deja.com/[ST_rn=qs]/getdoc.xp?AN=474042698&fmt=text
       (new interface)

   These two URLs should give the same results.

The script does NOT need to be started from a static HTML page.

Note: If you are running your web server behind a firewall, make sure you
set the proper proxy settings for the web server account, or hardcode the
proxy settings into DejaSearch.


6. USING DEJASEARCH WITH LYNX' SIMULATED CGI SUPPORT

(This tip was contributed by Morten Bo Johansen <mojo@image.dk>)

This is a nice tip for Lynx users: if you only want to access a CGI script
occasionally and have it deliver its results to you locally, running a
web server such as Apache is overkill. For this purpose Lynx can be
configured to access CGI scripts without any httpd daemon running in the
background. To enable this, simply configure Lynx with cgi-links enabled:

    ./configure --enable-cgi-links

prior to building.

Once Lynx is compiled and installed, place the dejasearch Perl script
anywhere you like, e.g. /usr/local/httpd/cgi-bin/dejasearch, and access it
from the Lynx prompt with this line:

    lynxcgi:/usr/local/httpd/cgi-bin/dejasearch

The dejasearch search form will appear on your screen and you're ready to
go!

Note: This was tested with Lynx ver. 2.8.3dev8 and DejaSearch 1.65. There
were some problems with Lynx ver. 2.8.2 and DejaSearch ver. 1.64: it seemed
that the messages themselves could not be retrieved from an index. The cause
has not been investigated. Other combinations, such as Lynx 2.8.2 with
DejaSearch ver. 1.65, may also work.


7. AUTO-LAUNCHING THE SEARCH RESULTS IN YOUR BROWSER

(This tip was contributed by Bill Goffe <Bill.Goffe@usm.edu>)

  I just got a copy of dejasearch, and I like it quite a lot. I wrote the
  following script (hardly worthy of the name) that puts the results in a
  Netscape window. Maybe something like it could be part of dejasearch? 
  Output to just summary.html is kinda dull.

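     #!/usr/bin/perl
     # Run dejasearch, writing the results to a fixed temporary file.
     # The list form of system() passes each argument through unchanged.
     system("dejasearch", @ARGV, "-output", "/tmp/ds.html");
     # Ask an already-running Netscape to open the results file.
     system("netscape", "-remote", "openURL(file:/tmp/ds.html)");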

I would prefer not to integrate this into DejaSearch, since it is
browser-dependent. Those of you who need this can refer to Bill's script.

The above only works with Netscape browsers, and the browser must already be
running. However, it should be easy to adapt this to work with other
browsers.


8. USING DEJASEARCH WITH AN AUTHENTICATING PROXY

If you are behind a proxy which requires authentication, edit dejasearch and
find the string "$auth". Then modify it to:

    $auth = "username:password";

Alternatively, you can create a file called ".dejasearchrc" in your home
directory with the proxy authentication information in "username:password"
format (on a single line and without the quotes). The mode of this file must
be 400 or 600, or DejaSearch aborts with a non-zero status.
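
One way to create such a file with the required mode is sketched below. The
write_dejasearchrc name is made up for this example, and
"username:password" stands in for your real credentials.

```shell
# write_dejasearchrc is a hypothetical helper: write the credentials on a
# single line and make sure only the owner can read or write the file.
write_dejasearchrc() {
  rc_file=$1
  credentials=$2
  # The restrictive umask covers a newly created file; the chmod also
  # tightens a file that already existed with wider permissions.
  ( umask 077 && printf '%s\n' "$credentials" > "$rc_file" ) &&
    chmod 600 "$rc_file"
}
```

For example:

    write_dejasearchrc "$HOME/.dejasearchrc" 'username:password'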


9. ACKNOWLEDGMENT

I would like to thank the GNU people. I don't know them personally, but they
have blessed us with free and great tools such as Linux, gcc, emacs, Perl,
fetchmail etc. which I now use on a daily basis. In their selfless spirit,
I would also like to share DejaSearch in the same way, and hope many people
besides me find it useful.

Thanks to all who provided valuable feedback/patches to make this program
better. Godspeed!