File: jsbind.txt

I am too lazy right now to write serious documentation about this feature,
so below is just a skeleton which I wrote as an announcement of this new
feature to the pavuk mailing list. Sorry ... more will follow later. If you
have any questions, try asking me directly ...

-------------------------------------------------------------------------

From ondrej@idata.sk Sun Jul 29 21:39:22 2001
Date: Thu, 12 Jul 2001 15:09:05 +0200 (CEST)
From: Stefan Ondrejicka <ondrej@idata.sk>
Reply-To: pavuk@yahoogroups.com
To: pavuk@yahoogroups.com
Subject: pavuk-0.9pl28h available for testing


Hi!

Here is a new pavuk testing version which introduces JavaScript bindings
for complicated tasks that need more flexibility than a non-scriptable
program can achieve. First you need to have the JavaScript library from
the Mozilla project installed on your system. The sources can be
downloaded from ftp://ftp.mozilla.org/pub/js/ . Compilation is very
simple.

The use of JavaScript scripts within pavuk is not yet documented in the
manual, so I will try to explain it in this email.

You can load one JavaScript file into pavuk using the option
-js_script_file. Currently there are two hooks in pavuk where the user can
insert their own JavaScript functions.

One is inside the routine which decides whether a particular URL should
be downloaded or not. If you want to insert your own JS decision function
you must name it "pavuk_url_cond_check". The prototype of this function
looks as follows:

function pavuk_url_cond_check (url, level)
{
}

level is an integer and indicates from which of five different places in
the pavuk code the pavuk_url_cond_check function is currently being called.
Levels 0 to 3 are described here:

level == 0 - condition checking is called from the HTML parsing routine.
	At this point you can use all conditions except -dmax, -min_time,
	-max_time, -max_size, -min_size, -amimet, -dmimet.
level == 1 - condition checking is called from the routine which queues
	URLs into the download queue. At this point you can use all
	conditions as in level 0, plus -dmax.
level == 2 - condition checking is called when the URL is taken from the
	download queue; it will be transferred if this check succeeds.
	At this point you can use the same set of conditions as in level 1.
level == 3 - condition checking is called after pavuk has sent the download
	request and detected the document size, modification time and MIME
	type. At this level you can use all conditions.
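
As a rough sketch of how the level parameter could be used, the function
below dispatches to one helper per level. The helper names (check_at_parse,
check_at_queue, check_at_dequeue, check_after_headers) are made up for this
illustration and are not part of pavuk. The sketch also assumes that
returning true accepts the URL and returning false rejects it.

```javascript
// Hypothetical per-level helpers; replace the bodies with real checks.
function check_at_parse (url)
{
	// level 0: illustrative rule, drop URLs pavuk could not classify
	return url.protocol != "unknown";
}
function check_at_queue (url)      { return true; }  // level 1
function check_at_dequeue (url)    { return true; }  // level 2
function check_after_headers (url) { return true; }  // level 3

function pavuk_url_cond_check (url, level)
{
	switch (level)
	{
		case 0: return check_at_parse(url);
		case 1: return check_at_queue(url);
		case 2: return check_at_dequeue(url);
		case 3: return check_after_headers(url);
	}

	// Any level not described above: do not reject the URL.
	return true;
}
```

Keeping one helper per level makes it harder to read an attribute at a
level where it is not yet defined.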

url is an instance of the PavukUrl class. It contains all information
about a particular URL and is a wrapper for the parsed URLs which pavuk
defines as a structure of type url.

It has the following attributes:
--- read-write attributes ---
status - (int32, always defined) holds bit fields with various information
	(see url.h for details)
--- read-only attributes, always defined ---
protocol - one of "http" "https" "ftp" "ftps" "file" "gopher" "unknown";
	   the kind of URL
level - level in document tree at which this URL lies
ref_cnt - number of parent documents which reference this URL
urlstr - full URL string
--- read-only attributes defined when protocol == "http" or "https" ---
http_host - host name or IP address 
http_port - port number
http_document - HTTP document
http_searchstr - query string when available (the part of URL after ?)
http_anchor_name - anchor name when available (the part of URL after #)
http_user - user name for authorization when available
http_password - password for authorization when available
--- read-only attributes defined when protocol == "ftp" or "ftps" ---
ftp_host - host name or IP address
ftp_port - port number
ftp_user - user name for authorization when available
ftp_password - password for authorization when available
ftp_path - path to file or directory
ftp_anchor_name - anchor name when available (the part of URL after #)
ftp_dir - flag indicating whether this URL points to a directory
--- read-only attributes defined when protocol == "file" ---
file_name - path to file or directory
file_searchstr - query string when available (the part of URL after ?)
file_anchor_name - anchor name when available (the part of URL after #)
--- read-only attributes defined when protocol == "gopher" ---
gopher_host - host name or IP address
gopher_port - port number
gopher_selector - selector string
--- read-only attributes defined when protocol == "unknown" ---
unsupported_urlstr - full URL string
--- read-only attributes available when performing checking of conditions ---
check_level - equivalent to level parameter of pavuk_url_cond_check function
mime_type - MIME type of this URL (defined when available)
doc_size - size of document (defined when available)
modification_time - modification time of document (defined when available)
doc_number - number of document in download queue (defined when available)
html_doc - full content of parent document of current URL (defined when
           level == 0)
html_doc_offset - offset of the current HTML tag in the parent document of
                  the URL (defined when level == 0)
moved_to - URL to which this URL was moved (defined when available)
html_tag - full HTML tag, including <>, from which the current URL is taken
           (defined when level == 0)
tag - name of the HTML tag from which the current URL is taken (defined when
      level == 0)
attrib - name of the HTML tag attribute from which the current URL is taken
	 (defined when level == 0)
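
Because most attributes exist only for certain protocols, a script has to
look at protocol before reading them. Here is a small sketch of that
pattern; url_host is a made-up helper name, and returning null for
protocols without a host attribute is just a choice made for this example.

```javascript
// Return the host of a PavukUrl-like object, or null when the protocol
// has no host attribute ("file", "unknown").
function url_host (url)
{
	switch (url.protocol)
	{
		case "http":
		case "https":
			return url.http_host;
		case "ftp":
		case "ftps":
			return url.ftp_host;
		case "gopher":
			return url.gopher_host;
	}

	return null;
}
```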

And the following methods:
get_parent(n) - get the URL of the n-th parent document
check_cond(name, ....) - check the condition whose option name is "name".
	If you do not provide additional parameters, pavuk will use the
	parameters from the command line or scenario file for condition
	checking. Otherwise it will use the listed parameters.


Here is an example of how a pavuk_url_cond_check function can look:

function pavuk_url_cond_check (url, level)
{
	// level 0: called from the HTML parsing routine
	if (level == 0)
	{
		if (url.level > 3 && url.check_cond("-asite", "www.host.com"))
			return false;

		if (url.check_cond("-url_rpattern",
			"http://www.idata.sk/~ondrej/",
			"http://www.idata.sk/~robo/") &&
		    url.check_cond("-dsfx", ".jar", ".tgz", ".png"))
			return false;
	}

	// level 2: URL was just taken from the download queue
	if (level == 2)
	{
		var par = url.get_parent();

		// skip URLs whose parent document was moved elsewhere
		if (par && par.moved_to)
			return false;
	}

	return true;
}

The example is useless, but shows you how to use this feature...
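
A slightly more practical sketch, under some assumptions: it rejects large
non-HTML documents once their size and MIME type are known. MAX_BYTES is a
made-up limit, doc_size is assumed to be in bytes, and an attribute that is
"defined when available" is assumed to be simply undefined otherwise.

```javascript
// Assumed limit for this illustration only.
var MAX_BYTES = 1024 * 1024;

function pavuk_url_cond_check (url, level)
{
	// mime_type and doc_size are only known after the request was sent.
	if (level == 3)
	{
		var is_html = (typeof url.mime_type == "string" &&
			url.mime_type.indexOf("text/html") == 0);

		if (!is_html && typeof url.doc_size == "number" &&
		    url.doc_size > MAX_BYTES)
			return false;
	}

	return true;
}
```

The typeof guards mean that a URL with unknown size or type is never
rejected by this function.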

The second possible use of JavaScript in pavuk is in the -fnrules option
for generating local names. This is done with a special function of the
extended -fnrules syntax called "jsf", which takes one parameter: the name
of the JavaScript function to be called. The function must return a
string, and its prototype looks something like this:

function some_jsf_func(fnrule)
{
}

The fnrule parameter is an instance of the PavukFnrules class.
It has one attribute
- url, which is of the PavukUrl type described above
and also one method
- get_macro(macro), which returns the value of the %x macros used in the
  -fnrules option.

You can do something like

-fnrules F "*" '(jsf "some_jsf_func")'
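
To make this more concrete, here is a minimal sketch of such a function.
The name host_based_name and the naming scheme are invented for this
illustration. It assumes that http_document starts with "/", and it uses
only the url attribute, since the exact argument form of get_macro is not
described here. Whether pavuk expects a relative or an absolute path as the
return value is not stated either, so treat the result as a placeholder.

```javascript
// Hypothetical jsf function: returns "<host><document path>" for HTTP
// URLs and a flattened copy of the URL string for everything else.
function host_based_name (fnrule)
{
	var url = fnrule.url;

	if (url.protocol == "http" || url.protocol == "https")
	{
		var doc = url.http_document;

		// Give directory-like documents a file name (assumed choice).
		if (doc.charAt(doc.length - 1) == "/")
			doc += "index.html";

		return url.http_host + doc;
	}

	// Replace every character outside A-Z, a-z, 0-9, ".", "_", "-".
	return url.urlstr.replace(/[^A-Za-z0-9._-]/g, "_");
}
```

Following the syntax shown above, it would be referenced as
'(jsf "host_based_name")' in the -fnrules option.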

That's all for now. In the future I will implement other possibilities
for using scripts.

The sources of the new pavuk version are available at the usual place
http://www.idata.sk/~ondrej/sw/pavuk-0.9pl28h.tgz .
The ChangeLog file is available at
http://www.idata.sk/~ondrej/pavuk/ChangeLog.dev .

And here is an excerpt of the changes from the ChangeLog:

* fixed msgfmt detection in configure script (thanks to Richard Ems)
* fixed compilation without SSL support (thanks to Richard Ems)
* updated Spanish Message catalog for 0.9pl27 (thanks to Francisco Javier
  Comerón Gayoso)
* rewritten limiting conditions checking engine again
* implemented JavaScript bindings to enable users to use more flexible
  conditions for excluding URLs from download (new option -js_script_file)
* implemented new function "jsf" for -fnrules option which allows
  execution of JavaScript functions by name

Best regards,
Stevo.

-- 
Stefan Ondrejicka <ondrej@idata.sk>
Beethovenova 11, 917 08 Trnava, Slovakia
http://www.idata.sk/~ondrej/

