NAME:

	Snoopy - the PHP net client v2.0.0
	
SYNOPSIS:

	include "Snoopy.class.php";
	$snoopy = new Snoopy;
	
	$snoopy->fetchtext("http://www.php.net/");
	print $snoopy->results;
	
	$snoopy->fetchlinks("http://www.phpbuilder.com/");
	print_r($snoopy->results);
	
	$submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";
	
	$submit_vars["q"] = "amiga";
	$submit_vars["submit"] = "Search!";
	$submit_vars["searchhost"] = "Altavista";
		
	$snoopy->submit($submit_url,$submit_vars);
	print $snoopy->results;
	
	$snoopy->maxframes=5;
	$snoopy->fetch("http://www.ispi.net/");
	echo "<PRE>\n";
	echo htmlentities($snoopy->results[0]); 
	echo htmlentities($snoopy->results[1]); 
	echo htmlentities($snoopy->results[2]); 
	echo "</PRE>\n";

	$snoopy->fetchform("http://www.altavista.com");
	print $snoopy->results;

DESCRIPTION:

	What is Snoopy?
	
	Snoopy is a PHP class that simulates a web browser. It automates
	tasks such as retrieving web page content and posting forms.

	Some of Snoopy's features:
	
	* easily fetch the contents of a web page
	* easily fetch the text from a web page (strip html tags)
	* easily fetch the links from a web page
	* supports proxy hosts
	* supports basic user/pass authentication
	* supports setting user_agent, referer, cookies and header content
	* supports browser redirects, and controlled depth of redirects
	* expands fetched links to fully qualified URLs (default)
	* easily submit form data and retrieve the results
	* supports following html frames (added v0.92)
	* supports passing cookies on redirects (added v0.92)
	
	
REQUIREMENTS:

	Snoopy requires PHP with PCRE (Perl Compatible Regular Expressions),
	and the OpenSSL extension for fetching HTTPS requests.	
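
	To check these requirements before using the class, a minimal
	sketch along these lines may help. The function name
	missing_snoopy_extensions() is hypothetical and not part of
	Snoopy; extension_loaded() is a standard PHP function.

```php
<?php
// Return the names of the required extensions that are not loaded.
// The default list mirrors the requirements stated above.
function missing_snoopy_extensions(array $required = array("pcre", "openssl"))
{
	$missing = array();
	foreach ($required as $ext) {
		if (!extension_loaded($ext)) {
			$missing[] = $ext;
		}
	}
	return $missing;
}

$missing = missing_snoopy_extensions();
if ($missing) {
	echo "missing extensions: " . implode(", ", $missing) . "\n";
} else {
	echo "all required extensions are loaded\n";
}
```

	If only plain http URLs are fetched, passing array("pcre") as
	the argument skips the OpenSSL check.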

CLASS METHODS:

	fetch($URI)
	-----------
	
	This is the method used for fetching the contents of a web page.
	$URI is the fully qualified URL of the page to fetch.
	The results of the fetch are stored in $this->results.
	If you are fetching frames, then $this->results
	contains each frame fetched in an array.
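
	Because the results can be either a single string or an array
	of frames, calling code may want to treat both cases the same
	way. The helper below is a hypothetical sketch, not part of
	Snoopy:

```php
<?php
// Hypothetical helper: always hand back the fetched documents as a list,
// whether fetch() stored a single page (string) or several frames (array).
function snoopy_result_list($results)
{
	if (is_array($results)) {
		return array_values($results);
	}
	return array($results);
}

// Usage with Snoopy (needs Snoopy.class.php and network access):
//   $snoopy = new Snoopy;
//   $snoopy->maxframes = 5;
//   if ($snoopy->fetch("http://www.ispi.net/")) {
//       foreach (snoopy_result_list($snoopy->results) as $doc)
//           echo "<PRE>".htmlspecialchars($doc)."</PRE>\n";
//   }
```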
		
	fetchtext($URI)
	---------------	
	
	This behaves exactly like fetch() except that it only returns
	the text from the page, stripping out html tags and other
	irrelevant data.		

	fetchform($URI)
	---------------	
	
	This behaves exactly like fetch() except that it only returns
	the form elements from the page, stripping out all other html
	tags and irrelevant data.

	fetchlinks($URI)
	----------------

	This behaves exactly like fetch() except that it only returns
	the links from the page. By default, relative links are
	converted to their fully qualified URL form.
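
	A sketch of walking the returned links, assuming the results
	are an array of URLs after fetchlinks(). The function name
	list_links() is hypothetical; the client is passed in as a
	parameter so the function works with any Snoopy instance.

```php
<?php
// Return one link per line, or an error message if the fetch failed.
// $client is expected to be a Snoopy instance.
function list_links($client, $url)
{
	if (!$client->fetchlinks($url)) {
		return "error fetching links: " . $client->error . "\n";
	}
	$out = "";
	// Assumption: results is an array of URLs; the cast also covers
	// the case of a single string.
	foreach ((array) $client->results as $link) {
		$out .= $link . "\n";
	}
	return $out;
}

// Usage with Snoopy (needs Snoopy.class.php and network access):
//   include "Snoopy.class.php";
//   echo list_links(new Snoopy, "http://www.phpbuilder.com/");
```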

	submit($URI,$formvars)
	----------------------
	
	This submits a form to the specified $URI. $formvars is an
	array of the form variables to pass.
		
		
	submittext($URI,$formvars)
	--------------------------

	This behaves exactly like submit() except that it only returns
	the text from the page, stripping out html tags and other
	irrelevant data.		

	submitlinks($URI,$formvars)
	---------------------------

	This behaves exactly like submit() except that it only returns
	the links from the page. By default, relative links are
	converted to their fully qualified URL form.


CLASS VARIABLES:	(default value in parentheses)

	$host			the host to connect to
	$port			the port to connect to
	$proxy_host		the proxy host to use, if any
	$proxy_port		the proxy port to use, if any
					proxy can only be used for http URLs, but not https
	$agent			the user agent to masquerade as (Snoopy v0.1)
	$referer		referer information to pass, if any
	$cookies		cookies to pass if any
	$rawheaders		other header info to pass, if any
	$maxredirs		maximum redirects to allow. 0=none allowed. (5)
	$offsiteok		whether or not to allow redirects off-site. (true)
	$expandlinks	whether or not to expand links to fully qualified URLs (true)
	$user			authentication username, if any
	$pass			authentication password, if any
	$accept			http accept types (image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*)
	$error			where errors are sent, if any
	$response_code	response code returned from server
	$headers		headers returned from server
	$maxlength		max return data length
	$read_timeout	timeout on read operations (requires PHP 4 Beta 4+)
					set to 0 to disallow timeouts
	$timed_out		true if a read operation timed out (requires PHP 4 Beta 4+)
	$maxframes		number of frames we will follow
	$status			http status of fetch
	$temp_dir		temp directory that the webserver can write to. (/tmp)
	$curl_path		system path to cURL binary, set to false if none
					(this variable is ignored as of Snoopy v1.2.6)
	$cafile			name of a file with CA certificate(s)
	$capath			name of a correctly hashed directory with CA certificate(s)
					if either $cafile or $capath is set, SSL certificate
					verification is enabled
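
	A sketch of how several of these variables might be combined
	for HTTPS fetches with a read timeout. Both function names are
	hypothetical and not part of Snoopy, and the values (10 seconds,
	2 redirects, the CA bundle path) are illustrative assumptions.

```php
<?php
// Apply conservative settings to a Snoopy instance and return it.
function apply_safe_defaults($client, $cafile)
{
	$client->read_timeout = 10;   // seconds; 0 would disallow timeouts
	$client->maxredirs = 2;       // follow at most two redirects
	$client->offsiteok = false;   // do not follow redirects off-site
	$client->cafile = $cafile;    // setting this enables certificate verification
	return $client;
}

// Summarize a fetch using the documented status variables.
// $ok is the return value of fetch() or a related method.
function describe_fetch($client, $ok)
{
	if ($client->timed_out) {
		return "timed out";
	}
	if (!$ok) {
		return "error: " . $client->error;
	}
	return "ok: " . trim($client->response_code);
}

// Usage with Snoopy (needs Snoopy.class.php and network access; the CA
// bundle path is an example and varies between systems):
//   $snoopy = apply_safe_defaults(new Snoopy, "/etc/ssl/certs/ca-certificates.crt");
//   $ok = $snoopy->fetch("https://www.php.net/");
//   echo describe_fetch($snoopy, $ok)."\n";
```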
	

EXAMPLES:

	Example: 	fetch a web page and display the return headers and
				the contents of the page (html-escaped):
	
	include "Snoopy.class.php";
	$snoopy = new Snoopy;
	
	$snoopy->user = "joe";
	$snoopy->pass = "bloe";
	
	if($snoopy->fetch("http://www.slashdot.org/"))
	{
		echo "response code: ".$snoopy->response_code."<br>\n";
		foreach($snoopy->headers as $key => $val)
			echo $key.": ".$val."<br>\n";
		echo "<p>\n";
		
		echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
	}
	else
		echo "error fetching document: ".$snoopy->error."\n";



	Example:	submit a form and print out the result headers
				and html-escaped page:

	include "Snoopy.class.php";
	$snoopy = new Snoopy;
	
	$submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";
	
	$submit_vars["q"] = "amiga";
	$submit_vars["submit"] = "Search!";
	$submit_vars["searchhost"] = "Altavista";

		
	if($snoopy->submit($submit_url,$submit_vars))
	{
		foreach($snoopy->headers as $key => $val)
			echo $key.": ".$val."<br>\n";
		echo "<p>\n";
		
		echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
	}
	else
		echo "error fetching document: ".$snoopy->error."\n";



	Example:	showing functionality of all the variables:
	

	include "Snoopy.class.php";
	$snoopy = new Snoopy;

	$snoopy->proxy_host = "my.proxy.host";
	$snoopy->proxy_port = "8080";
	
	$snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)";
	$snoopy->referer = "http://www.microsnot.com/";
	
	$snoopy->cookies["SessionID"] = "238472834723489";
	$snoopy->cookies["favoriteColor"] = "RED";
	
	$snoopy->rawheaders["Pragma"] = "no-cache";
	
	$snoopy->maxredirs = 2;
	$snoopy->offsiteok = false;
	$snoopy->expandlinks = false;
	
	$snoopy->user = "joe";
	$snoopy->pass = "bloe";
	
	if($snoopy->fetchtext("http://www.phpbuilder.com"))
	{
		foreach($snoopy->headers as $key => $val)
			echo $key.": ".$val."<br>\n";
		echo "<p>\n";
		
		echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
	}
	else
		echo "error fetching document: ".$snoopy->error."\n";


	Example: 	fetch framed content and display the results:
	
	include "Snoopy.class.php";
	$snoopy = new Snoopy;
	
	$snoopy->maxframes = 5;
	
	if($snoopy->fetch("http://www.ispi.net/"))
	{
		echo "<PRE>".htmlspecialchars($snoopy->results[0])."</PRE>\n";
		echo "<PRE>".htmlspecialchars($snoopy->results[1])."</PRE>\n";
		echo "<PRE>".htmlspecialchars($snoopy->results[2])."</PRE>\n";
	}
	else
		echo "error fetching document: ".$snoopy->error."\n";


COPYRIGHT:
	Copyright(c) 1999,2000 ispi. All rights reserved.
	This software is released under the GNU General Public License.
	Please read the disclaimer at the top of the Snoopy.class.php file.


THANKS:
	Special Thanks to:
	Peter Sorger <sorgo@cool.sk> help fixing a redirect bug
	Andrei Zmievski <andrei@ispi.net> implementing time out functionality
	Patric Sandelin <patric@kajen.com> help with fetchform debugging
	Carmelo <carmelo@meltingsoft.com> misc bug fixes with frames