WWW::Search and AutoSearch
==========================


WHAT IS NEW WITH WWW::Search 1.007?
-----------------------------------

- new: back-ends for Dejanews (from Cesare Feroldi de Rosa),
	Infoseek (also from Cesare Feroldi de Rosa),
	and Excite (from GLen Pringle)
- new: more fields in SearchResult (score, dates, etc., see the man page)
	(problem found by Cesare Feroldi de Rosa)
- new: better error handling on network failures
	(AutoSearch should report errors on its pages,
	$search->response() provides an API for error reporting)
- new (internal):  user_agent handling has changed
- new:  proxy support added to WWW::Search (still needed in applications)
	(problem and fix suggested by T. V. Raman)
- bug-fix: numerous documentation updates
	(problems found by Larry Virden)
- bug-fix: AltaVista web search was occasionally dropping hits
	(problem found by Larry Virden, fixed by Bill Scheding)
- bug-fix: all non-alphanumeric characters are now escaped
	(problem found by Larry Virden)


WHAT IS WWW::Search?
--------------------

WWW::Search is a collection of Perl modules which provide an API to
WWW search engines.  Currently WWW::Search includes back-ends for
variations of AltaVista, Dejanews, Excite, HotBot, Infoseek, and
Lycos.  We include two applications built from this library:
AutoSearch (a program to automate tracking of search results over
time) and search (a small demonstration program to drive the library).
Back-ends for other search engines and more sophisticated clients are
currently under development.



WHAT IS AutoSearch?
-------------------

WWW::Search's primary client is AutoSearch.  AutoSearch performs a
web-based search and puts the result set in a web page.  It
periodically updates this web page, indicating how the search changes
over time.  Sample output from AutoSearch can be found at
<http://www.isi.edu/lsam/autosearch/>.  Output format is configurable.

See the man page for AutoSearch details, or the DEMONSTRATION section
below for quick-start instructions.



REQUIREMENTS
------------

WWW::Search requires Perl5 and libwww-perl.
For information on Perl5, see <http://www.perl.com>.
For libwww-perl, see <http://www.sn.no/libwww-perl/>.
Both are also available from the Comprehensive Perl Archive
Network (CPAN). Visit <http://www.perl.com/CPAN/> to find a CPAN
site near you.

At this time WWW::Search has been tested with Perl versions 5.002 and
5.003.
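
To check these prerequisites before installing, something like the
following may help.  This is a minimal sketch, assuming a Bourne-style
shell with perl on your PATH; check_module is a hypothetical helper
name, not part of this distribution.

```shell
# check_module is a hypothetical helper (not shipped with WWW::Search).
# It asks perl to "require" its argument, which may be either a
# version number (such as 5.002) or a module name (such as LWP).
check_module() {
    if perl -e "require $1" 2>/dev/null; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

check_module 5.002   # the minimum Perl version named in this README
check_module LWP     # the main module of libwww-perl
```

A "missing" line means that prerequisite should be installed first.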



AVAILABILITY
------------

The latest version of WWW::Search should always be available from
<http://www.isi.edu/lsam/tools/WWW_SEARCH/>.

WWW::Search is also available as part of CPAN.  Visit
<http://www.perl.com/CPAN/> to find a CPAN site near you.

Feedback about WWW::Search is encouraged.  If you're using it for a
neat application, please let us know.  If you'd like to implement (or
have implemented) a new back-end for WWW::Search, let us know so we
don't duplicate work.




INSTALLATION
------------

In order to use this package you will need Perl version 5.002 or
later.  You install WWW::Search as you would install any Perl module
library, by running these commands:

    perl Makefile.PL
    make
    make install

If you want to install a private copy of WWW::Search in your home
directory, then you should try to produce the initial Makefile with
something like this command:

    perl Makefile.PL PREFIX=~/perl
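
With a private prefix, perl and your shell may not find the installed
files on their own.  The following is a sketch only, assuming a
Bourne-style shell and that the modules ended up under $HOME/perl/lib
and the programs under $HOME/perl/bin.  The exact library subdirectory
varies between Perl versions, so check the output of "make install"
and adjust the path.

```shell
# Assumption: Makefile.PL was run with PREFIX=~/perl.
PREFIX_DIR="$HOME/perl"

# Assumption: modules were installed below $PREFIX_DIR/lib; replace
# this with the directory that "make install" actually reported.
PERL5LIB="$PREFIX_DIR/lib${PERL5LIB:+:$PERL5LIB}"

# Put the installed programs (search, AutoSearch) on the search path.
PATH="$PREFIX_DIR/bin:$PATH"

export PERL5LIB PATH
```

Adding these lines to your shell start-up file makes the setting
permanent.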


Note:  running make on the current release reports the following error,
which can be ignored:

    /usr/local/bin/pod2man: bad option in paragraph 35 of lib/WWW/Search.pm: ``native_query('search-engine-specific+query+string',
        { option1 => 'able', option2 => 'baker' } )'' should be [LCI]<native_query('search-engine-specific+query+string',
        { option1 => 'able', option2 => 'baker' } )>



DEMONSTRATION
-------------

After installing the client programs,
try
	search '"Your Name Here"'
to see who's talking about you on the web.

Then (in your web page directory), try
	AutoSearch -n 'me on the web' -s '"Your Name Here"' me
and the web page me/index.html will be created summarizing
this information.
Then add
	0 3 * * 1 AutoSearch /path/to/your/web/pages/me
to your crontab(1) to update this search once a week (Mondays at 03:00).



DOCUMENTATION
-------------

See WWW/Search.pm for an overview of the library.
POD-style documentation is included in all modules
and scripts.  These are normally converted to manual pages and
installed as part of the "make install" process.  You should also be
able to use the 'perldoc' utility to extract documentation from the
module files directly.



FUTURE PLANS
------------

Some ideas:

  - application-level proxy support (I'm looking for a contribution
	here from someone who uses/needs proxy support)

  - more widespread use of new results tags across all back-ends

  - a test suite

  - a freeze/restore interface to suspend and resume in-progress queries

  - more back-ends

Other than a test suite I don't have major plans in the immediate
future (through 1Q1997); WWW::Search will be in maintenance mode.

Contributions from others are always welcome.  Send me e-mail if you
plan a new back-end or want to discuss architectural changes (to avoid
duplicating work).



RELEASE HISTORY
---------------

1.002:  (11 October 1996)
- First public release.

1.004:  (31 October 1996)
- new:  AutoSearch, a client application (see above for details)
- new:  WWW::Search is now in CPAN (see AVAILABILITY for details)
- bug fix:  installation problem (no rule to make CLIENTS/search) fixed

1.005:  (12 November 1996)
- new: back-ends for HotBot, Lycos, and several AltaVista variants
- new: application support for search-engine selection
- new: application and library support for search-engine options

1.006:  (25 November 1996)
- private beta release, see 1.007 for list of new features



SUPPORT AND CREDITS
-------------------

The WWW::Search architecture is by John Heidemann with feedback
from the other contributors.  Components of WWW::Search have been
written by several people:

APPLICATIONS:
	search			John Heidemann <johnh@isi.edu>
	AutoSearch 		William Scheding <wls@isi.edu>

BACK-ENDS:
	AltaVista		John Heidemann
	Dejanews		Cesare Feroldi de Rosa <C.Feroldi@it.net>
	Excite			GLen Pringle <pringle@cs.monash.edu.au>
	HotBot			William Scheding
	Infoseek		Cesare Feroldi de Rosa
	Lycos			William Scheding

AutoSearch is based on an earlier implementation by Kedar Jog with
advice from Joe Touch.

Bugs and extensions (to the software and documentation) have been
identified by William Scheding <wls@isi.edu>, T. V. Raman
<raman@adobe.com> (proxy support, fix included), C. Feroldi
<C.Feroldi@it.net> (fix included), Larry Virden <lvirden@cas.org> (fix
included).

Feedback, bug reports and fixes, and new back-ends should be sent to
John Heidemann <johnh@isi.edu>.



COPYRIGHT
---------

Copyright (c) 1996 University of Southern California.
All rights reserved.                                            
                                                               
Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are
duplicated in all such forms and that any documentation, advertising
materials, and other materials related to such distribution and use
acknowledge that the software was developed by the University of
Southern California, Information Sciences Institute.  The name of the
University may not be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.


Portions of this README are derived from the README for libwww-perl.



ISPELL
------

 LocalWords:  AltaVista Lycos Hotbot WebCrawler libwww perl com sn CPAN isi PL
 LocalWords:  lsam pl pm perldoc README LocalWords AutoSearch Search's html usr
 LocalWords:  crontab HotBot autosearch Scheding Kedar Dejanews Infoseek lib de
 LocalWords:  SearchResult LCI wls Cesare Feroldi GLen Pringle pringle monash
 LocalWords:  au Raman raman Virden lvirden cas org