= Mechanize Release Notes
== 0.6.3 (Big Man)
Mechanize 0.6.3 (Big Man) has a few bug fixes and some new features added to
the Form class. The Form class is now more hash-like. I've added a
Form#add_field! method that will add a field to your form. Form#[]= will now
add a field if the key doesn't exist. For example, if your form doesn't have
an input field named 'foo', the following two lines of code are equivalent,
and each will create a field named 'foo':
form['foo'] = 'bar'
or
form.add_field!('foo', 'bar')
To make forms more hash-like, has_value? and has_key? methods have also been
added.
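As a rough illustration of the hash-like behavior described above, here is a
minimal standalone sketch. SketchForm and its internals are hypothetical
stand-ins and not the actual Form implementation; only the method names
add_field!, []=, has_key? and has_value? come from these notes.

```ruby
# Hypothetical stand-in for a form; not the real Mechanize Form class.
class SketchForm
  Field = Struct.new(:name, :value)

  def initialize
    @fields = []
  end

  # Append a new field unconditionally.
  def add_field!(name, value = nil)
    @fields << Field.new(name, value)
  end

  # Read the value of the first field with this name (nil if absent).
  def [](name)
    field = @fields.find { |f| f.name == name }
    field && field.value
  end

  # Set the first matching field, or add one if none exists.
  def []=(name, value)
    field = @fields.find { |f| f.name == name }
    if field
      field.value = value
    else
      add_field!(name, value)
    end
  end

  def has_key?(name)
    @fields.any? { |f| f.name == name }
  end

  def has_value?(value)
    @fields.any? { |f| f.value == value }
  end
end

form = SketchForm.new
form['foo'] = 'bar'          # no field named 'foo' yet, so one is created
puts form.has_key?('foo')    # true
puts form.has_value?('bar')  # true
```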
== 0.6.2 (Bridget)
Mechanize 0.6.2 (Bridget) is a fairly small bug fix release. You can now
access the parsed page when a ResponseCodeError is thrown. For example, this
loads a page that doesn't exist, but gives you access to the parsed 404 page:
begin
  WWW::Mechanize.new().get('http://google.com/asdfasdfadsf.html')
rescue WWW::Mechanize::ResponseCodeError => ex
  puts ex.page
end
Accessing forms is now more DSL-like. When manipulating a form, for example,
you can use the following syntax:
page.form('formname') { |form|
  form.first_name = "Aaron"
}.submit
Documentation has also been updated thanks to Paul Smith.
== 0.6.1 (Chuck)
Mechanize version 0.6.1 (Chuck) is done, and is ready for you to use. This
post-"my trip to Europe" release includes many bug fixes and a handful of
new features.
New features include a submit method on forms, a click method on links, and an
REXML pluggable parser. Now you can submit a form just by calling a method on
the form, rather than passing the form to the submit method on the mech object.
The click method on links lets you click the link by calling a method on the
link rather than passing the link to the click method on the mech object.
Lastly, the REXML pluggable parser lets you use your pre-0.6.0 code with
0.6.1. See the CHANGELOG for more details.
== 0.6.0 (Rufus)
WWW::Mechanize 0.6.0 aka Rufus is ready! This hpricot flavored pie has
finished cooling on the window sill and is ready for you to eat. But if you
don't want to eat it, you can just download it and use it. I would
understand that.
The best new feature in this release in my opinion is the hpricot flavoring
packed inside. Mechanize now uses hpricot as its html parser. This means
mechanize gets a huge speed boost, and you can use the power of hpricot for
scraping data. Page objects returned from mechanize will allow you to use
hpricot search methods:
agent.get('http://rubyforge.org').search("//strong")
or
agent.get('http://rubyforge.org')/"strong"
The click method on mechanize has been updated so that you can click on links
you find using hpricot methods:
agent.click (page/"a").first
Or click on frames:
agent.click (page/"frame").first
The cookie parser has been overhauled to be more RFC 2109 compliant and to
use WEBrick cookies. Dependencies on ruby-web and mime-types have been
removed in favor of using hpricot and WEBrick respectively.
attr_finder and REXML helper methods have been removed.
== 0.5.4 (Sylvester)
WWW::Mechanize 0.5.4 aka Sylvester is fresh out of the frying pan and into
the fire! It is also ready for you to download and use.
New features include WWW::Mechanize#transact (thanks to Johan Kiviniemi) which
lets you maintain your history state between transactions. Forms can now be
accessed as a hash. For example, to set the value of an input field, you can
do the following:
form['name'] = "Aaron"
Doing this assumes that you are setting the first field. If there are multiple
fields with the same name, you must use a different method to set the value.
Form file uploads will now read the file specified by FileUpload#file_name.
The mime type will also be automatically determined for you! Take a look
at the EXAMPLES file for a new flickr upload script.
Lastly, gzip encoding is now supported! WWW::Mechanize now supports pages
being sent gzip encoded. This means less network bandwidth. Yay!
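Mechanize takes care of this for you, but as a rough idea of what handling a
gzip encoded response involves, here is a minimal standalone sketch that uses
only Ruby's standard library. The decode_body helper is hypothetical and is
not part of Mechanize.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: decode a response body according to the value of
# its Content-Encoding header. Not part of Mechanize.
def decode_body(body, content_encoding)
  if content_encoding.to_s.downcase == 'gzip'
    Zlib::GzipReader.new(StringIO.new(body)).read
  else
    body
  end
end

# Simulate a server that gzips its response.
compressed = Zlib.gzip('<html><body>hello</body></html>')

puts decode_body(compressed, 'gzip')
```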
== 0.5.3 (Twan)
Here it is. Mechanize 0.5.3, also named the "Twan" release. There are a few
new features, a few fixed bugs, and some other stuff too!
First, new features. WWW::Mechanize#click has been updated to operate on the
first link if an array is passed in. How is this helpful? It allows
syntax like this:
agent.click page.links.first
to be written like this:
agent.click page.links
This trick was actually implemented in WWW::Mechanize::List. If you send a
method to WWW::Mechanize::List, and it doesn't know how to respond, it will
try calling that method on the first element of the list. But it only does
that for methods with no arguments.
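To make the idea concrete, here is a minimal standalone sketch of a list that
forwards unknown zero-argument methods to its first element. SketchList and
SketchLink are hypothetical names used for illustration; this is not the real
WWW::Mechanize::List code.

```ruby
# Hypothetical sketch; not the real WWW::Mechanize::List.
class SketchList < Array
  # Forward unknown methods to the first element, but only when they are
  # called without arguments.
  def method_missing(name, *args, &block)
    if args.empty? && !empty? && first.respond_to?(name)
      first.public_send(name, &block)
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    (!empty? && first.respond_to?(name)) || super
  end
end

SketchLink = Struct.new(:text, :href)

links = SketchList.new([SketchLink.new('Home', '/'),
                        SketchLink.new('About', '/about')])
puts links.href   # forwarded to links.first
```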
Radio buttons, check boxes, and select lists can now be ticked, unticked, and
clicked. Now to select the second radio button from a list, you can do this:
form.radiobuttons.name('color')[1].click
Mechanize will handle unchecking all of the other radio buttons with the same
name.
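The unchecking behavior can be pictured with a small standalone model.
SketchRadioButton is a hypothetical class written for this example, not one
of Mechanize's own classes.

```ruby
# Hypothetical model of a radio button group; not Mechanize's own classes.
class SketchRadioButton
  attr_reader :name, :value
  attr_accessor :checked

  def initialize(name, value, group)
    @name = name
    @value = value
    @checked = false
    @group = group
    group << self
  end

  # Check this button and uncheck every other button sharing its name.
  def click
    @group.each { |b| b.checked = false if b.name == name }
    @checked = true
  end
end

buttons = []
%w[red green blue].each { |v| SketchRadioButton.new('color', v, buttons) }

buttons[0].click
buttons[1].click
puts buttons.map { |b| "#{b.value}=#{b.checked}" }.join(' ')
# red=false green=true blue=false
```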
Pretty printing has been added so that inspecting mechanize objects is very
pretty. Go ahead and try it out!
pp page
Or even
pp page.forms.first
Now, bugfixes. A bug was fixed when spaces are passed in as part of the URL
to WWW::Mechanize#get. Thanks to Eric Kolve, a bug was fixed with methods
that conflict with Rails. Thanks to Yinon Bentor for sending in a patch to
improve Log4r support and give a slight speed increase.
== 0.5.2
This release comes with a few cool new features. First, support for select
lists which are "multi" has been added. This means that you can select
multiple values for a select list that has the "multiple" attribute. See
WWW::Mechanize::MultiSelectList for more information.
New methods for select lists have been added. You can use the select_all
method to select all options and select_none to select no options. Options
can now be "selected", which selects an option; "unselected", which unselects
an option; and "clicked", which toggles the current status of
the option. What this means is that instead of having to select the first
option like this:
select_list.value = select_list.options.first.value
You can select the first option by just saying this:
select_list.options.first.select
Of course you can still set the select list to an arbitrary value by just
setting the value of the select list.
A new method has been added to Form so that multiple fields can be set at the
same time. To set 'foo', and 'name' at the same time on the form, you can do
the following:
form.set_fields( :foo => 'bar', :name => 'Aaron' )
Or to set the second field named 'name' you can do the following:
form.set_fields( :name => ['Aaron', 1] )
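To show how the [value, index] form can be read, here is a minimal standalone
sketch. SketchField and this free-standing set_fields function are
hypothetical; the sketch assumes the index is zero-based, so 1 means the
second field with that name.

```ruby
# Hypothetical field type; not a Mechanize class.
SketchField = Struct.new(:name, :value)

# fields: array of SketchField
# values: hash of name => value, or name => [value, index]
def set_fields(fields, values)
  values.each do |name, v|
    value, index = v.is_a?(Array) ? v : [v, 0]
    matching = fields.select { |f| f.name == name.to_s }
    target = matching[index]
    target.value = value if target
  end
end

fields = [
  SketchField.new('foo', nil),
  SketchField.new('name', nil),
  SketchField.new('name', nil)
]

set_fields(fields, :foo => 'bar', :name => ['Aaron', 1])
puts fields.map(&:value).inspect   # ["bar", nil, "Aaron"]
```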
Finally, attr_finder has been deprecated, and all syntax like this:
@agent.links(:text => 'foo')
needs to be changed to:
@agent.links.text('foo')
With this release you will just get a warning, and the code will be removed in
0.6.0.
== 0.5.1
This release is a small bugfix release. The main bug fixed in this release is
a problem with file uploading. I have also made some performance improvements
to cookie parsing.
== 0.5.0
Good News first:
This release has many new great features! Mechanize has been updated to
handle any content type a web server returns using a system called "Pluggable
Parsers". Mechanize has always been able to handle any content type
(sort of), but the pluggable parser system lets us cleanly handle any
content type by instantiating a class for the content type returned from the
server. For example, when a web server returns type 'text/html', mechanize asks
the pluggable parser for a class to instantiate for 'text/html'. Mechanize
then instantiates that class and returns it. You can define your own
parsers and register them with the Pluggable Parser so that mechanize will
instantiate your class when the content type you specify is returned. This
allows you to easily preprocess your HTML, or even use other HTML parsers.
Content types that the pluggable parser doesn't know how to handle will
return WWW::Mechanize::File which has basic functionality like a 'save_as'
method. For more information, see the RDoc for
WWW::Mechanize::PluggableParser and the EXAMPLES file.
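The lookup described above can be pictured as a registry keyed by content
type, with a fallback class for unknown types. The sketch below is
standalone; every class name in it (SketchFile, SketchPage, SketchCsv,
SketchPluggableParser) and the register and parser_for methods are
hypothetical. See the RDoc for the real interface.

```ruby
# Fallback class for content types nobody has registered.
class SketchFile
  attr_reader :body

  def initialize(body)
    @body = body
  end

  def save_as(path)
    File.open(path, 'wb') { |f| f.write(body) }
  end
end

# Default class for 'text/html'.
class SketchPage < SketchFile
  def title
    body[%r{<title>(.*?)</title>}m, 1]
  end
end

# Hypothetical registry mapping content types to classes.
class SketchPluggableParser
  def initialize
    @parsers = { 'text/html' => SketchPage }
    @default = SketchFile
  end

  def register(content_type, klass)
    @parsers[content_type] = klass
  end

  def parser_for(content_type)
    # Ignore parameters such as "; charset=utf-8".
    type = content_type.to_s.split(';').first.to_s.strip.downcase
    @parsers.fetch(type, @default)
  end
end

# A user-defined parser registered for its own content type.
class SketchCsv < SketchFile
  def rows
    body.lines.map { |line| line.chomp.split(',') }
  end
end

parsers = SketchPluggableParser.new
parsers.register('text/csv', SketchCsv)

page = parsers.parser_for('text/html; charset=utf-8').new('<title>Hi</title>')
puts page.title                               # Hi
puts parsers.parser_for('application/pdf')    # SketchFile
```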
A 'save_as' method has been added so that any page downloaded can be easily
saved to a file.
The cookie jar for mechanize can now be saved to disk and loaded back up at
another time. If your script needs to save cookie state between executions,
you can now use the 'save_as' and 'load' methods on WWW::Mechanize::CookieJar.
Form fields can now be treated as accessors. This means that if you have a
form with the fields 'username' and 'password', you could manipulate them like
this:
form.username = 'test'
form.password = 'testing'
puts "username: #{form.username}"
puts "password: #{form.password}"
Form fields can still be accessed in the usual way in case there are multiple
input fields with the same name.
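One common way to get accessors like these in Ruby is method_missing. The
following standalone sketch shows that technique; SketchAccessorForm is a
hypothetical class, and this is not necessarily how Form does it internally.

```ruby
# Hypothetical form that exposes its fields as reader/writer methods.
class SketchAccessorForm
  def initialize(fields)
    @fields = fields   # hash of field name => value
  end

  def method_missing(name, *args)
    key = name.to_s
    if key.end_with?('=') && @fields.key?(key.chomp('='))
      @fields[key.chomp('=')] = args.first   # writer, e.g. form.username = 'x'
    elsif @fields.key?(key)
      @fields[key]                           # reader, e.g. form.username
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    @fields.key?(name.to_s.chomp('=')) || super
  end
end

form = SketchAccessorForm.new('username' => nil, 'password' => nil)
form.username = 'test'
form.password = 'testing'
puts "username: #{form.username}"
puts "password: #{form.password}"
```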
Bad news second:
In this release, the name space has been altered to be more consistent. Many
classes used to be under WWW directly; they are now all under WWW::Mechanize.
For example, in 0.4.7 Page was WWW::Page, in this release it is now
WWW::Mechanize::Page. This may break your code, but if you aren't using
class names directly, everything should be fine.
Body filters have been removed in favor of Pluggable Parsers.
== 0.4.7
This release of mechanize comes with a few bug fixes including fixing a
bug when there is no action specified in a form.
In this version, a default user agent string is now set for mechanize. Also,
a convenience method WWW::Mechanize#get_file has been added for fetching
non text/html files.
== 0.4.6
The 0.4.6 release comes with proxy support which can be enabled by calling
the set_proxy method on your WWW::Mechanize object. Once you have set your
proxy settings, all mechanize requests will go through the proxy.
A new "visited?" method has been added to WWW::Mechanize so that you can see
if any particular URL is in your history.
Image alt text support has been added to links. If a link contains an image
with no text, the alt text of the image will be used. For example:
<a href="foo.html"><img src="foo.gif" alt="Foo Image"></a>
This link will contain the text "Foo Image", and can be found like this:
link = page.links.text('Foo Image')
Lists of things have been updated so that you can set a value without
specifying the position in the array. It will just assume that you want to
set the value on the first element. For example, the following two statements
are equivalent:
form.fields.name('q').first.value = 'xyz' # Old syntax
form.fields.name('q').value = 'xyz' # New syntax
This new syntax comes with a note of caution: make sure you know you want to
set only the first value. There could be multiple fields with the name 'q'.
== 0.4.5
This release comes with a new filtering system. You can now manipulate the
response body before mechanize parses it. This can be useful if you know that
the HTML you need to parse is broken, or if you want to speed up the parsing.
This filter can be done on a global basis, or on a per page basis. Check out
the new examples in the EXAMPLES file for usage.
This release is also starting to phase out the misspelled method
WWW::Mechanize#basic_authetication. If you are using that method, please
switch to WWW::Mechanize#basic_auth.
The 0.4.5 release has many bug fixes, most notably better cookie parsing and
better form support.
== 0.4.4
This release of mechanize comes with a new "Option" object that can be
accessed from select fields on forms. That means that you can figure out
what option to set based on the text in the select field. For example:
selectlist = form.fields.name('selectlist').first
selectlist.value = selectlist.options.find { |o| o.text == 'foo'}.value
== 0.4.3
The new syntax for finding things like forms, fields, frames, etcetera looks
like this:
page.links.with.text 'Some Text'
The preceding statement will find all links in a page with the text
'Some Text'. This can be applied to form fields as well:
form.fields.with.name 'email'
These can be chained as well like this:
form.fields.with.name('email').and.with.value('blah@domain.com')
'with' and 'and' can be omitted, and the old way is still supported. The
following statements all do the same thing:
form.fields.find_all { |f| f.name == 'email' }
form.fields.with.name('email')
form.fields.name('email')
form.fields(:name => 'email')
Regular expressions are also supported:
form.fields.with.name(/email/)
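For the curious, a chainable finder like this can be built on Ruby's ===
operator, which tests equality for strings and matching for regular
expressions. The sketch below is standalone; SketchFinderList and SketchInput
are hypothetical names, and this is not the library's actual implementation.

```ruby
# Hypothetical field type; not a Mechanize class.
SketchInput = Struct.new(:name, :value)

class SketchFinderList < Array
  # "with" and "and" are no-ops that exist purely for readability.
  def with
    self
  end
  alias_method :and, :with

  def name(pattern)
    matching(:name, pattern)
  end

  def value(pattern)
    matching(:value, pattern)
  end

  private

  # String === str tests equality; Regexp === str tests for a match.
  def matching(attr, pattern)
    self.class.new(select { |item| pattern === item.public_send(attr) })
  end
end

fields = SketchFinderList.new([
  SketchInput.new('email', 'blah@domain.com'),
  SketchInput.new('email_confirm', ''),
  SketchInput.new('q', 'xyz')
])

puts fields.with.name('email').length                                    # 1
puts fields.with.name(/email/).length                                    # 2
puts fields.with.name('email').and.with.value('blah@domain.com').length  # 1
```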