File: EXAMPLES

= WWW::Mechanize examples

== Google
  require 'rubygems'
  require 'mechanize'
  
  agent = WWW::Mechanize.new
  agent.user_agent_alias = 'Mac Safari'
  page = agent.get("http://www.google.com/")
  search_form = page.forms.with.name("f").first
  search_form.q = "Hello"
  search_results = agent.submit(search_form)
  puts search_results.body

== Rubyforge
  require 'mechanize'
  
  agent = WWW::Mechanize.new
  page = agent.get('http://rubyforge.org/')
  # links.text returns a list of matches; take the first one explicitly
  link = page.links.text(/Log In/).first
  page = agent.click(link)
  form = page.forms[1]
  form.form_loginname = ARGV[0]
  form.form_pw = ARGV[1]
  page = agent.submit(form, form.buttons.first)
  
  puts page.body

== File Upload
This example logs in to Flickr and uploads a single image, whose path is given
as the third command-line argument.

 require 'rubygems'
 require 'mechanize'
 
 agent = WWW::Mechanize.new
 
 # Get the flickr sign in page
 page  = agent.get('http://flickr.com/signin/flickr/')
 
 # Fill out the login form
 form  = page.forms.name('flickrloginform').first
 form.email = ARGV[0]
 form.password = ARGV[1]
 page  = agent.submit(form)
 
 # Go to the upload page
 page  = agent.click(page.links.text('Upload').first)
 
 # Fill out the form
 form  = page.forms.action('/photos_upload_process.gne').first
 form.file_uploads.name('file1').first.file_name = ARGV[2]
 agent.submit(form)
  
== Pluggable Parsers
Let's say you want HTML pages to be parsed automatically with Rubyful Soup.
This example shows how:

  require 'rubygems'
  require 'mechanize'
  require 'rubyful_soup'

  class SoupParser < WWW::Mechanize::Page
    attr_reader :soup
    def initialize(uri = nil, response = nil, body = nil, code = nil)
      @soup = BeautifulSoup.new(body)
      super(uri, response, body, code)
    end
  end

  agent = WWW::Mechanize.new
  agent.pluggable_parser.html = SoupParser

Now all HTML pages will be parsed with the SoupParser class, and each page
automatically has a method called 'soup' that returns the Beautiful Soup
object for that page.
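
The mechanism can be illustrated without Mechanize or Rubyful Soup installed.
The following is only a minimal sketch of the idea, not Mechanize's actual
implementation: SketchPage, TitlePage and SketchParserRegistry are
hypothetical names invented for this illustration. A registry maps a content
type to a page class, and a subclass adds its own reader before deferring to
the base class with super, just as SoupParser does above.

```ruby
# Hypothetical stand-ins; none of these classes are part of Mechanize.
class SketchPage
  attr_reader :uri, :body

  def initialize(uri = nil, body = nil)
    @uri  = uri
    @body = body
  end
end

# A custom parser adds its own reader, then defers to the base class.
class TitlePage < SketchPage
  attr_reader :title

  def initialize(uri = nil, body = nil)
    # Extract the text between <title> tags, or nil if there is none.
    @title = body.to_s[%r{<title>(.*?)</title>}mi, 1]
    super(uri, body)
  end
end

# Maps a content type to the class used to wrap a response body.
class SketchParserRegistry
  def initialize
    @parsers = { 'text/html' => SketchPage }
    @default = SketchPage
  end

  # Replace the class used for HTML responses.
  def html=(klass)
    @parsers['text/html'] = klass
  end

  # Look up the class for a content type, falling back to the default.
  def parser_for(content_type)
    @parsers.fetch(content_type, @default)
  end
end

registry = SketchParserRegistry.new
registry.html = TitlePage

page = registry.parser_for('text/html').new(
  'http://example.com/', '<html><title>Hello</title></html>'
)
puts page.title
```

Because the lookup happens per content type, only HTML responses are wrapped
in the custom class in this sketch; any other type falls back to the default.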

== Using a proxy

  require 'rubygems'
  require 'mechanize'
  
  agent = WWW::Mechanize.new
  agent.set_proxy('localhost', '8000')
  page = agent.get(ARGV[0])
  puts page.body

== The transact method

transact runs the given block and then resets the page history. That is, after
the block has been executed, you're back at the original page; there is no
need to count how many times to call the back method at the end of a loop
(while accounting for possible exceptions).

This example also demonstrates subclassing Mechanize.

  require 'mechanize'

  class TestMech < WWW::Mechanize
    def process
      get 'http://rubyforge.org/'
      search_form = page.forms.first
      search_form.words = 'WWW'
      submit search_form

      page.links.with.href( %r{/projects/} ).each do |link|
        next if link.href =~ %r{/projects/support/}

        puts 'Loading %-30s %s' % [link.href, link.text]
        begin
          transact do
            click link
            # Do stuff, maybe click more links.
          end
          # Now we're back at the original page.

        rescue => e
          $stderr.puts "#{e.class}: #{e.message}"
        end
      end
    end
  end

  TestMech.new.process
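
The history-resetting behaviour of transact can be sketched in plain Ruby.
This is an illustration under stated assumptions, not Mechanize's own code:
SketchAgent is a hypothetical class whose get method only records URLs instead
of fetching them. The point it shows is that saving the history before the
block and restoring it in an ensure clause brings you back to the original
page even when the block raises.

```ruby
# Hypothetical stand-in for an agent that tracks visited pages.
class SketchAgent
  attr_reader :history

  def initialize
    @history = []
  end

  # Pretend to fetch a page: this only records the URL.
  def get(url)
    @history.push(url)
    url
  end

  # The most recently visited page, or nil before the first get.
  def page
    @history.last
  end

  # Run the block, then restore the history as it was on entry,
  # even if the block raises.
  def transact
    saved = @history.dup
    begin
      yield self
    ensure
      @history = saved
    end
  end
end

agent = SketchAgent.new
agent.get('http://example.com/index')

begin
  agent.transact do |a|
    a.get('http://example.com/a')
    a.get('http://example.com/b')
    raise 'simulated failure'
  end
rescue => e
  $stderr.puts "#{e.class}: #{e.message}"
end

# Back at the page visited before the transact block.
puts agent.page
```

The exception still propagates to the caller, which is why the TestMech
example wraps transact in begin/rescue.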