File: scanYahoo.py

pyparsing 2.1.10+dfsg1-1

from pyparsing import makeHTMLTags, SkipTo, htmlComment
import urllib.request

# Fetch the page. read() returns bytes, so decode to str before scanning.
with urllib.request.urlopen("http://www.yahoo.com") as serverListPage:
    htmlText = serverListPage.read().decode("utf-8", errors="replace")

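# makeHTMLTags returns expressions for the opening and closing <A> tags.
aStart, aEnd = makeHTMLTags("A")

# A link is an opening tag, the body text up to the closing tag, then the closing tag.
link = aStart + SkipTo(aEnd).setResultsName("link") + aEnd
link.ignore(htmlComment)  # skip anything inside HTML comments

# scanString yields (tokens, start, end) for each match found in the text.
for toks, start, end in link.scanString(htmlText):
    print(toks.link, "->", toks.startA.href)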