# ********************************************************************
#
# File : $Source: /cvsroot/ijbswa/current/default.filter,v $
#
# $Id: default.filter,v 1.86 2013/02/19 11:14:47 fabiankeil Exp $
#
# Purpose : Rules to process the content of web pages
#
# Copyright : Written by and Copyright (C) 2001-2010 the
# Privoxy team. http://www.privoxy.org/
#
# We value your feedback. However, to provide you with the best support,
# please note:
#
# * Use the support forum to get help:
# http://sourceforge.net/tracker/?group_id=11118&atid=211118
# * Submit bugs only through our bug forum:
# http://sourceforge.net/tracker/?group_id=11118&atid=111118
# Make sure that the bug has not already been submitted. Please try
# to verify that it is a Privoxy bug, and not a browser or site
# bug first. If you are using your own custom configuration, please
# try the stock configs to see if the problem is a configuration
# related bug. And if not using the latest development snapshot,
# please try the latest one. Or even better, CVS sources.
# * Submit feature requests only through our feature request forum:
# http://sourceforge.net/tracker/?atid=361118&group_id=11118&func=browse
#
# For any other issues, feel free to use the mailing lists:
# http://sourceforge.net/mail/?group_id=11118
#
# Anyone interested in actively participating in development and related
# discussions can join the appropriate mailing list here:
# http://sourceforge.net/mail/?group_id=11118. Archives are available
# here too.
#
#################################################################################
#
# Syntax:
#
# Generally, filters start with a line like "FILTER: name description".
# They can then be referenced from the actions file with +filter{name}.
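#
# Example (a minimal sketch; the filter name "example-filter", its job
# and the host pattern are made up for illustration and are not part of
# this file):
#
# FILTER: example-filter Replace "foo" with "bar".
# s/foo/bar/g
#
# A matching actions file section could then enable it like this:
#
# { +filter{example-filter} }
# .example.org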
#
# FILTER marks a filter as a content filter. The other filter
# types are CLIENT-HEADER-FILTER, CLIENT-HEADER-TAGGER,
# SERVER-HEADER-FILTER and SERVER-HEADER-TAGGER.
#
# Inside the filters, write one Perl-Style substitution (job) per line.
# Jobs that precede the first FILTER: line are ignored.
#
# For details, see the pcrs man page contained in this distribution
# (and the perlre, perlop and pcre man pages).
#
# Note that you are free to choose the delimiter as you see fit.
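#
# For example (a sketch), the following two jobs should be equivalent.
# The second one avoids escaping the slashes by using '|' as delimiter:
#
# s/<\/blink>/<\/b>/ig
# s|</blink>|</b>|ig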
#
# Note2: In addition to the Perl options gimsx, the following nonstandard
# options are supported:
#
# 'U' turns the default to ungreedy matching. Add ? to quantifiers to
# switch back to greedy.
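#
# For example (a sketch), applied to "<i>a</i> b <i>c</i>" the job
#
# s|<i>.*</i>||gU
#
# should remove both <i> elements separately and leave " b ", while
# the same job without 'U' would remove everything from the first <i>
# to the last </i>.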
#
# 'T' (trivial) prevents parsing for backreferences in the substitute.
# Use if you want to include text like '$&' in your substitute without
# quoting.
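#
# For example (a sketch), the job
#
# s/five dollars/$5/T
#
# should insert the literal text "$5" instead of treating it as a
# reference to a fifth capture group.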
#
# 'D' (Dynamic) allows the use of variables. Supported variables are:
# $host, $origin (the IP address the request came from), $path and $url.
#
# Note that '$' is a bad choice as delimiter for dynamic filters, as you
# might end up with unintended variables if you use a variable name
# directly after the delimiter. Variables will be resolved without
# escaping anything, so you also have to be careful not to choose
# delimiters that appear in the replacement text. For example, '<' should
# be safe, while '?' will sooner or later cause conflicts with $url.
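#
# Example (a sketch; the placeholder text PRIVOXY-HOST is made up for
# illustration): using '<' as delimiter, the dynamic job
#
# s<PRIVOXY-HOST<$host<D
#
# should replace the first PRIVOXY-HOST in the content with the value
# of $host for the current request.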
#
#################################################################################
#################################################################################
#
# js-annoyances: Get rid of particularly annoying JavaScript abuse.
#
#################################################################################
FILTER: js-annoyances Get rid of particularly annoying JavaScript abuse.
# Note: Most of these jobs would be safer if restricted to a
# <script> context as in:
#
# s/(<script.*)nasty-item(?=.*<\/script>)/$1replacement/sigU
#
# but that would make them match only the first occurrence of
# nasty-item in each <script>. We need nestable jobs!
# Get rid of JavaScript referrer tracking.
# Test page: http://www.javascript-page.com/referrer.html
#
s|(?:\w+\.)+referrer|"Not Your Business!"|gisU
# The status bar is for displaying link targets, not pointless blahblah
#
s@([\W]\s*)((?:this|window)\.(?:default)?status)\s*=\s*((['"]).*?\4)@$1$2 =\
(typeof(this.href) != 'undefined')?($3 + ' URL: ' + this.href):($2)@ig
s/(?:(?:this|window)\.(?:default)?status)\s*=\s*\w*\s*;//ig
# Kill OnUnload popups. Yummy.
# Test: http://www.zdnet.com/zdsubs/yahoo/tree/yfs.html
#
s/(<body\s+[^>]*)onunload/$1never/siU
s|(<script.*)window\.onunload(?=.*</script>)|$1never|sigU
# If we allow window.open, we want normal window features:
# Test: http://www.htmlgoodies.com/beyond/notitle.html
#
s/(open\s*\([^\)]+resizable=)(["']?)(?:no|0)\2/$1$2yes$2/sigU
s/(open\s*\([^\)]+location=)(["']?)(?:no|0)\2/$1$2yes$2/sigU
s/(open\s*\([^\)]+status=)(["']?)(?:no|0)\2/$1$2yes$2/sigU
s/(open\s*\([^\)]+scroll(?:ing|bars)=)(["']?)(?:no|0)\2/$1$2auto$2/sigU
s/(open\s*\([^\)]+menubar=)(["']?)(?:no|0)\2/$1$2yes$2/sigU
s/(open\s*\([^\)]+toolbar=)(["']?)(?:no|0)\2/$1$2yes$2/sigU
s/(open\s*\([^\)]+directories=)(["']?)(?:no|0)\2/$1$2yes$2/sigU
s/(open\s*\([^\)]+fullscreen=)(["']?)(?:yes|1)\2/$1$2no$2/sigU
s/(open\s*\([^\)]+always(?:raised|lowered)=)(["']?)(?:yes|1)\2/$1$2no$2/sigU
s/(open\s*\([^\)]+z-?lock=)(["']?)(?:yes|1)\2/$1$2no$2/sigU
s/(open\s*\([^\)]+hotkeys=)(["']?)(?:yes|1)\2/$1$2no$2/sigU
s/(open\s*\([^\)]+titlebar=)(["']?)(?:no|0)\2/$1$2yes$2/sigU
s/(open\s*\([^\)]+always(?:raised|lowered)=)(["']?)(?:yes|1)\2/$1$2no$2/sigU
#################################################################################
#
# js-events: Kill JavaScript event bindings and timers (Radically destructive! Only for extra nasty sites).
#
#################################################################################
FILTER: js-events Kill JavaScript event bindings and timers (Radically destructive! Only for extra nasty sites).
s/(on|event\.)((mouse(over|out|down|up|move))|(un)?load|contextmenu|selectstart)/never/ig
# Not events, but abused on the same type of sites:
s/(alert|confirm)\s*\(/concat(/ig
s/set(timeout|interval)\(/concat(/ig
#################################################################################
#
# html-annoyances: Get rid of particularly annoying HTML abuse.
#
#################################################################################
FILTER: html-annoyances Get rid of particularly annoying HTML abuse.
# New browser windows (if allowed -- see no-popups filter below) should be
# resizable and have a location and status bar.
#
s/(<a\s+href[^>]+resizable=)(['"]?)(?:no|0)\2/$1$2yes$2/igU
s/(<a\s+href[^>]+location=)(['"]?)(?:no|0)\2/$1$2yes$2/igU
s/(<a\s+href[^>]+status=)(['"]?)(?:no|0)\2/$1$2yes$2/igU
s/(<a\s+href[^>]+scrolling=)(['"]?)(?:no|0)\2/$1$2auto$2/igU
s/(<a\s+href[^>]+menubar=)(['"]?)(?:no|0)\2/$1$2yes$2/igU
# The <BLINK> and <MARQUEE> tags were crimes!
#
s-</?(blink|marquee).*>--sigU
#################################################################################
#
# content-cookies: Kill cookies that come in the HTML or JS content.
#
#################################################################################
FILTER: content-cookies Kill cookies that come in the HTML or JS content.
# JS cookies, except those used by antiadbuster.com to detect us:
#
s|(\w+\.)+cookie(?=[ \t\r\n]*=)(?!='aab)|ZappedCookie|ig
# HTML cookies:
#
s|<meta\s+http-equiv=['"]?set-cookie.*>|<!-- ZappedCookie -->|igU
#################################################################################
#
# refresh-tags: Kill automatic refresh tags if refresh time is larger than 9 seconds.
#
#################################################################################
FILTER: refresh-tags Kill automatic refresh tags if refresh time is larger than 9 seconds.
# Note: Only deactivates refreshes with more than 9 seconds delay to
# preserve monster-stupid but common redirections via meta tags.
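#
# Example (a sketch with a made-up URL): the job below should turn
#
# <meta http-equiv="refresh" content="30; url=http://example.org/">
#
# into
#
# <link rev="x-refresh" href="http://example.org/">
#
# while a tag with content="5; url=..." is left untouched, as its
# delay has only one digit.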
#
s@<meta\s+http-equiv\s*=\s*(['"]?)refresh\1\s+content\s*=\s*(['"]?)\d{2,}\s*(;(?:\s*url\s*=\s*)?([^>\2]*))?\2@<link rev="x-refresh" href="$4"@ig
#################################################################################
#
# unsolicited-popups: Disable unsolicited pop-up windows.
#
#################################################################################
FILTER: unsolicited-popups Disable only unsolicited pop-up windows.
s+([^'"]\s*<head.*>)(?=\s*[^'"])+$1<script>function PrivoxyWindowOpen(){return(null);}</script>+isU
s@([^\w\s.]\s*)((?:map)?(window|this|parent)\.?)?open\s*\(@$1PrivoxyWindowOpen(@ig
s+([^'"]\s*</html>)(?!\s*(\\n|'|"))+$1<script>function PrivoxyWindowOpen(a, b, c){return(window.open(a, b, c));}</script>+iU
##################################################################################
#
# all-popups: Kill all popups in JavaScript and HTML.
#
#################################################################################
FILTER: all-popups Kill all popups in JavaScript and HTML.
s@((\W\s*)(?:map)?(window|this|parent)\.?)open\s*\\?\(@$1concat(@ig # JavaScript
#s/\starget\s*=\s*(['"]?)_?(blank|new)\1?/ notarget/ig # HTML
s/\starget\s*=\s*(['"]?)_?(blank|new)\1?/ /ig # (X)HTML
##################################################################################
#
# img-reorder: Reorder attributes in <img> tags to make the banners-by-* filters more effective.
#
#################################################################################
FILTER: img-reorder Reorder attributes in <img> tags to make the banners-by-* filters more effective.
# In the first step src is moved to the start, then width is moved to the second
# place to guarantee an order of src, width, height. Also does some white-space
# normalization.
#
# This makes banners-by-size more effective and allows both banners-by-size
# and banners-by-link to preserve the original image URL in the title attribute.
s|<img\s+?([^>]*)\ssrc\s*=\s*(['"])([^>\\\2]+)\2|<img src=$2$3$2 $1|siUg
s|<img\s+?([^>]*)\ssrc\s*=\s*([^'">\\\s]+)|<img src=$2 $1|sig
s|(<img[^>]+height)\s*=\s*|$1=|sig
s|<img (src=(?:(['"])[^>\\\\2]*\2\|[^'">\\\s]+?))([^>]*)\s+width\s*=\s*((["']?)\d+?\5)(?=[\s>])|<img $1 width=$4$3|siUg
#################################################################################
#
# banners-by-size: Kill banners by size.
#
#################################################################################
#
# Standard banner sizes taken from http://www.iab.net/iab_banner_standards/bannersizes.html
#
# Note: Use http://config.privoxy.org/send-banner?type=trans for a transparent 1x1 image
# Use http://config.privoxy.org/send-banner?type=pattern for a grey/white pattern image
# Use http://config.privoxy.org/send-banner?type=auto to auto-select.
#
# Note2: Use img-reorder before this filter to ensure maximum matching success
#
#################################################################################
FILTER: banners-by-size Kill banners by size.
# 88*31
s@<img\s+(?:src\s*=\s*(['"]?)([^>\\\1\s]+)\1)?[^>]*?(width=(['"]?)88\4)[^>]*?(height=(['"]?)31\6)[^>]*?(?=/?>)@\
<img src="http://config.privoxy.org/send-banner?type=auto" border="0" title="Killed-$2-by-size" $3 $5@sig
# 120*60, 120*90, 120*240, 120*600
s@<img\s+(?:src\s*=\s*(['"]?)([^>\\\1\s]+)\1)?[^>]*?(width=(['"]?)120\4)[^>]*?(height=(['"]?)(?:600?|90|240)\6)[^>]*?(?=/?>)@\
<img src="http://config.privoxy.org/send-banner?type=auto" border="0" title="Killed-$2-by-size" $3 $5@sig
# 125*125
s@<img\s+(?:src\s*=\s*(['"]?)([^>\\\1\s]+)\1)?[^>]*?(width=(['"]?)125\4)[^>]*?(height=(['"]?)125\6)[^>]*?(?=/?>)@\
<img src="http://config.privoxy.org/send-banner?type=auto" border="0" title="Killed-$2-by-size" $3 $5@sig
# 160*600
s@<img\s+(?:src\s*=\s*(['"]?)([^>\\\1\s]+)\1)?[^>]*?(width=(['"]?)160\4)[^>]*?(height=(['"]?)600\6)[^>]*?(?=/?>)@\
<img src="http://config.privoxy.org/send-banner?type=auto" border="0" title="Killed-$2-by-size" $3 $5@sig
# 180*150
s@<img\s+(?:src\s*=\s*(['"]?)([^>\\\1\s]+)\1)?[^>]*?(width=(['"]?)180\4)[^>]*?(height=(['"]?)150\6)[^>]*?(?=/?>)@\
<img src="http://config.privoxy.org/send-banner?type=auto" border="0" title="Killed-$2-by-size" $3 $5@sig
# 234*60, 468*60 (Most Banners!)
s@<img\s+(?:src\s*=\s*(['"]?)([^>\\\1\s]+)\1)?[^>]*?(width=(['"]?)(?:234|468)\4)[^>]*?(height=(['"]?)60\6)[^>]*?(?=/?>)@\
<img src="http://config.privoxy.org/send-banner?type=auto" border="0" title="Killed-$2-by-size" $3 $5@sig
# 240*400
s@<img\s+(?:src\s*=\s*(['"]?)([^>\\\1\s]+)\1)?[^>]*?(width=(['"]?)240\4)[^>]*?(height=(['"]?)400\6)[^>]*?(?=/?>)@\
<img src="http://config.privoxy.org/send-banner?type=auto" border="0" title="Killed-$2-by-size" $3 $5@sig
# 250*250, 300*250
s@<img\s+(?:src\s*=\s*(['"]?)([^>\\\1\s]+)\1)?[^>]*?(width=(['"]?)(?:250|300)\4)[^>]*?(height=(['"]?)250\6)[^>]*?(?=/?>)@\
<img src="http://config.privoxy.org/send-banner?type=auto" border="0" title="Killed-$2-by-size" $3 $5@sig
# 336*280
s@<img\s+(?:src\s*=\s*(['"]?)([^>\\\1\s]+)\1)?[^>]*?(width=(['"]?)336\4)[^>]*?(height=(['"]?)280\6)[^>]*?(?=/?>)@\
<img src="http://config.privoxy.org/send-banner?type=auto" border="0" title="Killed-$2-by-size" $3 $5@sig
# Note: 200*50 was also proposed, but it probably causes too much collateral damage:
#
#s@<img\s+(?:src\s*=\s*(['"]?)([^>\\\1\s]+)\1)?[^>]*?(width=(['"]?)200\4)[^>]*?(height=(['"]?)50\6)[^>]*?(?=/?>)@\
# <img src="http://config.privoxy.org/send-banner?type=auto" border="0" title="Killed-$2-by-size" $3 $5@sig
#################################################################################
#
# banners-by-link: Kill banners by their links to known clicktrackers (Experimental).
#
#################################################################################
FILTER: banners-by-link Kill banners by their links to known clicktrackers.
# Common case with width and height attributes:
#
s@<a\s+href\s*=\s*(['"]?)([^>\1\s]*?(?:\
adclick # See www.dn.se \
| advert # see dict.leo.org \
| atwola\.com/(?:link|redir) # see www.cnn.com \
| doubleclick\.net/jump/ # redirs for doubleclick.net ads \
| counter # common \
| (?<!&type=)tracker # (&type=tracker is used in sf's project statistics) \
| adlog\.pl # see sf.net \
)[^>\1\s]*)\1[^>]*>\s*<img\s+(?:src\s*=\s*(['"]?)([^>\\\3\s]+)\3)?[^>]*((?:width|height)\s*=\s*(['"]?)\d+?\6)[^>]*((?:width|height)\s*=\s*(['"]?)\d+?\8)[^>]*?(?=/?>)\
@<img $5 $7 src="http://config.privoxy.org/send-banner?type=auto" border="0" title="Killed $4 by link to $2"@sigx
# Rare case w/o explicit dimensions:
#
s@<a\s+href\s*=\s*(['"]?)([^>\1\s]*?(?:ad(?:click|vert)|atwola\.com/(?:link|redir)|doubleclick\.net/jump/|(?<!&type=)tracker|counter|adlog\.pl)[^>\1\s]*)\1[^>]*>\s*<img\s+(?:src\s*=\s*(['"]?)([^>\\\3\s]+)\3)?[^>]*?(?=/?>)@<img src="http://config.privoxy.org/send-banner?type=auto" border="0" title="Killed $4 by link to $2"@sig
################################################################################
#
# webbugs: Squish WebBugs (1x1 invisible GIFs used for user tracking).
#
#################################################################################
FILTER: webbugs Squish WebBugs (1x1 invisible GIFs used for user tracking).
s@<img[^>]*\s(?:width|height)\s*=\s*['"]?[01](?=\D)[^>]*\s(?:width|height)\s*=\s*['"]?[01](?=\D)[^>]*?>@@siUg
#################################################################################
#
# tiny-textforms: Extend those tiny textareas up to 40x80 and kill the hard wrap.
#
#################################################################################
FILTER: tiny-textforms Extend those tiny textareas up to 40x80 and kill the hard wrap.
s/(<textarea[^>]*?)(?:\s*(?:rows|cols)=(['"]?)\d+\2)+/$1 rows=$2\40$2 cols=$2\80$2/ig
s/(<textarea[^>]*?)wrap=(['"]?)hard\2/$1/ig
#################################################################################
#
# jumping-windows: Prevent windows from resizing and moving themselves.
#
#################################################################################
FILTER: jumping-windows Prevent windows from resizing and moving themselves.
s/(?<=[\W])(?:window|this|self)\.(?:move|resize)(?:to|by)\(/''.concat(/ig
#################################################################################
#
# frameset-borders: Give frames a border, make them resizable and scrollable.
#
#################################################################################
FILTER: frameset-borders Give frames a border and make them resizable.
s/(<frameset\s+[^>]*)framespacing=(['"]?)(no|0)\2/$1/igU
s/(<frameset\s+[^>]*)frameborder=(['"]?)(no|0)\2/$1/igU
s/(<frameset\s+[^>]*)border=(['"]?)(no|0)\2/$1/igU
s/(<frame\s+[^>]*)noresize/$1/igU
s/(<frame\s+[^>]*)frameborder=(['"]?)(no|0)\2/$1/igU
s/(<frame\s+[^>]*)scrolling=(['"]?)(no|0)\2/$1/igU
#################################################################################
#
# iframes: Remove all detected iframes. Should only be enabled for
# individual sites after testing that the iframes are optional.
#
#################################################################################
FILTER: iframes Removes all detected iframes. Should only be enabled for individual sites.
s@<iframe.*</iframe>@<!-- iframe removed by Privoxy's iframe filter -->@Uisg
#################################################################################
#
# demoronizer: Correct Microsoft's abuse of standardized character sets, which
# leave the browser to (mis)-interpret unknown characters, with
# sometimes bizarre results on non-MS platforms.
#
# credit: ripped from the demoroniser.pl script by:
# John Walker -- January 1998, http://www.fourmilab.ch/webtools/demoroniser
#
#################################################################################
FILTER: demoronizer Fix MS's non-standard use of standard charsets.
s/(&\#[0-2]\d\d)\s/$1; /g
# per Robert Lynch: http://slate.msn.com//?id=2067547, just a guess.
# Must come before x94 below.
s/\xE2\x80\x94/ -- /g
s/\x82/,/g
#s-\x83-<em>f</em>-g
s/\x84/,,/g
s/\x85/.../g
#s/\x88/^/g
#s-\x89- /-g
s/\x8B/</g
s/\x8C/Oe/g
s/\x91/`/g
s/\x92/'/g
s/(\x93|\x94)/"/g
# Bullet type character.
s/\x95/&middot;/g
s/\x96/-/g
s/\x97/--/g
#s-\x98-<sup>~</sup>-g
#s-\x99-<sup>TM</sup>-g
# per Robert Lynch.
s/\x9B/>/g # 155
#################################################################################
#
# shockwave-flash: Kill embedded Shockwave Flash objects.
# Note: Better just block "/.*\.swf$"!
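#
# A sketch of such a block rule for an actions file (the reason text
# is made up for illustration):
#
# { +block{Shockwave Flash object.} }
# /.*\.swf$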
#
#################################################################################
FILTER: shockwave-flash Kill embedded Shockwave Flash objects.
s|<object [^>]*macromedia.*</object>|<!-- Squished Shockwave Object -->|sigU
s|<embed [^>]*(application/x-shockwave-flash\|\.swf).*>(.*</embed>)?|<!-- Squished Shockwave Flash Embed -->|sigU
#################################################################################
#
# quicktime-kioskmode: Make Quicktime movies saveable.
#
#################################################################################
FILTER: quicktime-kioskmode Make Quicktime movies saveable.
s/(<embed\s+[^>]*)kioskmode\s*=\s*(["']?)true\2/$1/ig
#################################################################################
#
# fun: Text replacements for subversive browsing fun!
#
#################################################################################
FILTER: fun Text replacements for subversive browsing fun!
# SCNR
#
s/microsoft(?!\.[^\s])/MicroSuck/ig
# Buzzword Bingo (example for extended regex syntax)
#
s* (?:industry|world)[ -]leading \
| cutting[ -]edge \
| customer[ -]focused \
| market[ -]driven \
| award[ -]winning # Comments are OK, too! \
| high[ -]performance \
| solutions[ -]based \
| unmatched \
| unparalleled \
| unrivalled \
*$0<sup><font color="red"><b>Bingo!</b></font></sup> \
*igx
# For Germans only
#
s/(M|m)edien(?![^<]*>)/$1&auml;dchen/Ug
#################################################################################
#
# crude-parental: Crude parental filtering. Use with a suitable blocklist.
# Pages are "blocked" based on keyword matching.
#
#################################################################################
FILTER: crude-parental Crude parental filtering. Note that this filter doesn't work reliably.
# (Note: Middlesex, Sussex and Essex are counties in the UK, not rude words)
# (Note #2: Is 'sex' a rude word?!)
s%^.*(?<!middle)(?<!sus)(?<!es)sex.*$%<html><head><title>Blocked</title></head><body>\
<h3>Blocked by Privoxy's crude-parental filter due to possible adult content.</h3></body></html>%is
s+^.*warez.*$+<html><head><title>No Warez</title></head><body><h3>You're not searching for illegal stuff, are you?</h3></body></html>+is
# Remove by description
s/^.*\
(?:(suck|lick|tongue|rub|fuck|fingering|finger|chicks?)\s*)?\
(?:(her|your|my|hard|with|big|wet|tight|pink|hot|moist|young|teen)\s*)+\
(dicks?|penis|cocks?|balls?|tits?|pussy|cunt|clit|ass|mouth).*$\
/This page has been blocked by Privoxy's crude-parental content filter\
/is
#Remove by link text
s/^.*\
(download|broadband|view|watch|free|get|extreem)?\s*\
(sex|xxx|porn|cumshot|fuck(ing|s)?|anal|ass|asian|adult|Amateur|org(y|ies)|close ups?|hand?job|nail(ed)?)+\s*\
(movies?|pics?|videos?|dvds?|dvd's|links?).*$\
/This page has been blocked by Privoxy's crude-parental content filter\
/is
#Remove by age disclaimer
s/^.*\
(models?|chicks?|girls?|women|persons)\s*\
(who|are|were)+ (over|at least) (16|18|21) years (old|of age).*$\
/This page has been blocked by Privoxy's crude-parental content filter\
/is
#Remove by regulations
s/^.*(Section 2257|18 U.?S.?C.? 2257).*$\
/This page has been blocked by Privoxy's crude-parental content filter\
/is
#################################################################################
#
# ie-exploits: Disable some known Internet Explorer bug exploits.
#
#################################################################################
FILTER: ie-exploits Disable some known Internet Explorer bug exploits.
# Note: This is basically a demo and waits for someone more interested in IE
# security (sic!) to take over.
# Cross-site-scripting:
#
s%f\("javascript:location.replace\('mk:@MSITStore:C:'\)"\);%alert\("This page looks like it tries to use a vulnerability described here:\n http://online.securityfocus.com/archive/1/298748/2002-11-02/2002-11-08/2"\);%siU
# Address bar spoofing (http://www.secunia.com/advisories/10395/):
#
s/(<a[^>]*href[^>]*)(?:\x01|\x02|\x03|%0[012])@/$1MALICIOUS-LINK@/ig
# Nimda:
#
s%<script language="JavaScript">(window\.open|1;''\.concat)\("readme\.eml", null, "resizable=no,top=6000,left=6000"\)</script>%<br><font size="7"> WARNING: This Server is infected with <a href="http://www.cert.org/advisories/CA-2001-26.html">Nimda</a>!</font>%g
#################################################################################
#
#
# site-specifics: Cure for site-specific problems. Don't apply generally!
#
# Note: The fixes contained here are so specific to the problems of the
# particular web sites they are designed for that they would be a
# waste of CPU cycles (or even destructive!) on 99.9% of the web
# sites where they don't apply.
#
#################################################################################
FILTER: site-specifics Cure for site-specific problems. Don't apply generally!
# www.spiegel.de excludes X11 users from viewing Flash5 objects - shame.
# Apply to: www.spiegel.de/static/js/flash-plugin.js
#
s/indexOf\("x11"\)/indexOf("x13")/
# www.quelle-bausparkasse.de uses a very stupid redirect mechanism that
# relies on a webbug being present. Can we tolerate that? No!
# Apply to: www.quelle-bausparkasse.de/$
#
s/mylogfunc\(\)//g
# groups.yahoo.com has splash pages that one needs to click through in
# order to access the actual messages. Let the browser do that. Thanks
# to Paul Jobson for this one:
#
s|<a href="(.+?)">(?:Continue to message\|Weiter zu Nachricht)</a>|<meta http-equiv="refresh" content="0; URL=$1">|ig
# monster.com has two very similar gimmicks:
#
s|<input type="hidden" name="REDIRECT" value="(.+?)">|<meta http-equiv="refresh" content="0; URL=$1">|i
s|<IMG SRC="http://media.monster.com/mm/usen/my/no_thanks_211x40.gif".+?>|<meta http-equiv="refresh" content="0; URL=http://my.monster.com/resume.asp">|i
# nytimes.com triggers popups through the onload handler of dummy images
# to fool popup-blockers.
#
s|(<img [^>]*)onload|$1never|sig
# Pre-check all the "Discard" buttons in GNU Mailman's web interface.
# (This saves a lot of mouse aiming practice when flushing spamtraps)
#
s|(<INPUT name="\d{2,4}" type="RADIO" value="0") CHECKED |$1|g
s|<INPUT name="\d{2,4}" type="RADIO" value="3" |$0 checked|g
#################################################################################
#
# no-ping: Removes non-standard ping attributes in <a> and <area> tags.
#
#################################################################################
FILTER: no-ping Removes non-standard ping attributes in <a> and <area> tags.
s@(<a(?:rea)?[^>]*?)\sping=(['"]?)([^"'>]+)\2([>\s]?)@\
<strong style="color:white; background-color:red;" title="Privoxy removed ping target '$3'">PING!</strong>\n$1$4@ig
#################################################################################
#
# google: CSS-based block for Google text ads. Also removes
# a width limitation and the toolbar advertisement.
#
#################################################################################
FILTER: google CSS-based block for Google text ads. Also removes a width limitation and the toolbar advertisement.
s@</head>[^\\]@<style type="text/css">\n\
/* Style sheet inserted by Privoxy's google filter. */\n\
\#fbc, \#fbl, \#ra, .rhh {visibility: hidden !important;}\n\
\#tpa1,\#tpa2,\#tpa3,\#tpa4,\#tpa5,\#tpa5, \#spl, .ch, \#ads,\
\#toolbar, \#google_ads_frame, \#mbEnd {display: none !important;}\n\
.main_body, .j, \#res, .med, .hd, .g, .s\n\
{width: 99%; max-width: 100%; margin-left: 0; margin-right: 0;}\n\
</style>\n$0@
s@<div style=\"padding-top:11px;min-width:500px\">@<div id="main_body">@
s@(<table cellspacing=0 cellpadding=0 width=25% align=right bgcolor=\#ffffff border=0\
|</font></td></tr></tbody></table><table align=\"right\" bgcolor=\"\#ffffff\"\
|<table cellspacing=0 cellpadding=0 align=right bgcolor=\#ffffff border=0\
|<table style=\"clear:both\" align=right width=25% cellspacing=\"0\" cellpadding=\"0\"\
border=\"0\" bgcolor=\"\#ffffff\")@$0 id="ads"@
s@(<br clear=all><table)( border=0 cellpadding=9><tr><td)@$1 id="toolbar"$2@
#################################################################################
#
# yahoo: CSS-based block for Yahoo text ads. Also removes a width limitation.
#
#################################################################################
FILTER: yahoo CSS-based block for Yahoo text ads. Also removes a width limitation.
s@</head>@\n<style type="text/css">\n\
/* Style sheet inserted by Privoxy's yahoo filter. */\n\
\#symadbn, \#ymadbn, .yschbox, .yschhd, .bbox, \#yschsec, \#sec,\
\#yschanswr, .yschftad, .yschspn, .yschspns, \#ygrp-sponsored-links,\
\#nwad, \#MWA2, \#MSCM, \#yregad, \#sponsored-links,\
\#ks-ypn-ads, .ad, \#east, \#ygrp-vital, .ads {display: none !important;}\n\
\#yschpri, \#yschweb, \#pri, \#web, \#main, .yschttl, .abstr, .res \n\
{width: 99% !important; max-width: 100% !important;}\n\
.yschttl, .res, .res.indent, \#web {padding: 0px; margin: 0px !important;}\n\
\#web {padding-left: 0.5em}\n\
\#yschqcon, \#yschtg {width: auto !important; /* No useless horizontal scrollbar please */}\n\
\#composebox \#compose_editorArea {width: 70% !important; /* reasonably sized reply textarea please */}\n\
</style>\n$0\n@
s@(<textarea\s+id="composeArea"[^>]*)width:545px;@$1width:70%;@isU
#################################################################################
#
# msn: CSS-based block for MSN text ads. Also removes tracking URLs
# and a width limitation.
#
#################################################################################
FILTER: msn CSS-based block for MSN text ads. Also removes tracking URLs and a width limitation.
s@</head>@<style type="text/css">\n\
/* Style sheet inserted by Privoxy's msn filter. */\n\
.msn_ads, \#at, \#ar, .mktmsg {display: none !important;}\n\
\#results, .flank, .results_area_flank, .results_area_stroke,\n\
\#results_area, \#content, .sb_tlst, .sa_cc, .sb_ph, \#sw_main,\n\
.content, \#sw_foot, \#bf, \#sw_content, \#sidebar, \#pag\n\
{width: 99% !important; min-width: 99% !important;\n\
max-width: 100% !important; /* width:100% sometimes causes horizontal scrollbars */}\n\
/* Remove "suggestions". They are next to worthless but partly overlap with the search results */\n\
.suggestion, \#nys_right, \#nys {clear: both; display:none;}\n\
\#s_notf_div,\n \
/* Overlay ads to enable Facebook 'likes' in search results. */\n\
.sn_container {display:none !important;}\n\
\#content {padding-right: 0;}\n\
</style>\n$0@
# Are these ids still in use?
s@(<div[^>]*) id=(["']?)ads_[^\2]*\2@$1 class="msn_ads"@Uig
s@(<div[^>]*) class=(["']?)sb_ads[^\2]*\2@$1 class="msn_ads"@Uig
s@(<a[^>]*href=\")http://g.msn.com/.*\?(http://.*)(&&DI=.*)(\")@$1$2$4@Ug
s@(<a[^>]*)gping=\".*\"@$1 title="URL cleaned up by Privoxy's msn filter"@Ug
#################################################################################
#
# blogspot: Cleans up some Blogspot blogs. Read the fine print before using this.
#
# This filter also intentionally removes some navigation stuff and
# sets the page width to 100%. As a result, some rounded "corners" would
# appear too early or not at all. As fixing this would require a browser
# that understands background-size (CSS3), they are removed instead.
#
# When applied to feeds, it removes comment titles that
# only contain the beginning of the actual comment.
#
#################################################################################
FILTER: blogspot Cleans up some Blogspot blogs. Read the fine print before using this.
s@</head>@<style type="text/css">\n\
/* Style sheet inserted by Privoxy's blogspot filter. */\n\
\#powered-by {display: none !important;}\n\
\#wrap4, \#wrapper {margin-top: 0px }\n\
\#blogheader, \#header {margin-top: 0.5em !important}\n\
\#content {width: 98% }\n\
\#main {width: 70% }\n\
\#sidebar {width: 29% }\n\
.post-body {overflow: auto;}\n\
.blogComments {width: 100%; overflow: auto;}\n</style>\n$0@
s@<body.*(?:<div id="space-for-ie"></div>|(<div id="(?:content|wrap4|wrapper)))@<body>\
<!-- Privoxy's blogspot filter ditched some garbage here -->$1@Us
s@(<div style=\"[^\"]*width:)30em@$1 100%@
s@background:url\(\"http://www.blogblog.com/rounders[^\"]*\"\).*;@/*$0*/@Ug
s@(background:\#[a-f\d]{3})( url\(\"http://www.blogblog.com/rounders[^\"]*\"\).*;)@$1 ;/*$2*/@Ug
# Do the feed filtering magic as described above.
s@<title(?:\s+type=\'text\')?>([^<]*)(?:\.\.\.)?\s*</title>\s*\
(<content(?:\s+type=\'(?:html|text)\')?>\s*\1)@<title></title>$2@ig
#################################################################################
#
# x-httpd-php-to-html: Changes the Content-Type header from
# x-httpd-php to html. "Content-Type: x-httpd-php"
# is set by clueless PHP users and causes many
# browsers to open a download menu instead of
# rendering the page.
#
#################################################################################
SERVER-HEADER-FILTER: x-httpd-php-to-html Changes the Content-Type header from x-httpd-php to html.
s@^(Content-Type:)\s*application/x-httpd-php@$1 text/html@i
#################################################################################
#
# html-to-xml: Changes the Content-Type header from html to xml.
#
#################################################################################
SERVER-HEADER-FILTER: html-to-xml Changes the Content-Type header from html to xml.
s@^(Content-Type:)\s*text/html(;.*)?$@$1 application/xhtml+xml$2@i
#################################################################################
#
# xml-to-html: Changes the Content-Type header from xml to html.
#
#################################################################################
SERVER-HEADER-FILTER: xml-to-html Changes the Content-Type header from xml to html.
s@^(Content-Type:)\s*(?:application|text)/(?:xhtml\+)?xml(;.*)?$@$1 text/html$2@i
#################################################################################
#
# hide-tor-exit-notation: Remove the Tor exit node notation in Host and Referer headers.
#
# Note: If Privoxy and Tor are chained and Privoxy is configured to
# use socks4a, one can use http://www.example.org.foobar.exit/
# to access the host www.example.org through Tor exit node foobar.
#
# As the HTTP client isn't aware of this notation, it treats the
# whole string "www.example.org.foobar.exit" as host and uses it
# for the "Host" and "Referer" headers. From the server's point of
# view the resulting headers are invalid and can cause problems.
#
# An invalid "Referer" header can trigger "hot-linking" protections, and
# an invalid "Host" header will make it impossible for the server to
# find the right vhost (several domains hosted on the same IP address).
#
# This filter removes the ".foobar.exit" part in those headers
# to prevent the mentioned problems. Note that it only modifies
# the HTTP headers; it doesn't make it impossible for the server
# to detect your Tor exit node based on the IP address the request is
# coming from.
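#
# A sketch of the effect, assuming the hypothetical exit node "foobar"
# from the note above. Headers like:
#
#   Host: www.example.org.foobar.exit
#   Referer: http://www.example.org.foobar.exit/index.html
#
# should leave Privoxy as:
#
#   Host: www.example.org
#   Referer: http://www.example.org/index.html
#
# The filter could be enabled for all requests with an action
# section like:
#
# {+client-header-filter{hide-tor-exit-notation}}
# /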
#
#################################################################################
CLIENT-HEADER-FILTER: hide-tor-exit-notation Removes the Tor exit node notation in Host and Referer headers.
s@^((?:Referer|Host):\s*(?:https?://)?[^/]*)\.[^\./]*?\.exit@$1@i
#################################################################################
#
# less-download-windows: Prevents annoying download windows for content types
# the browser can handle itself.
#
#################################################################################
SERVER-HEADER-FILTER: less-download-windows Prevents annoying download windows for content types the browser can handle itself.
s@^Content-Disposition:.*filename=(["']?).*\.(png|gif|jpe?g|diff?|d?patch|c|h|pl|shar)\1.*$@@i
s@^(Content-Type:)\s*(?:message/(?:news|rfc822)|text/x-.*|application/x-sh(?:\s|$))\s*@$1 text/plain@i
#################################################################################
#
# image-requests: Tags detected image requests as "IMAGE-REQUEST". Whether
# or not the detection actually works depends on the browser.
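#
# A possible use, assuming the browser sends an Accept header that
# starts with "image/" for image requests. The action chosen here is
# only an illustration:
#
# {+client-header-tagger{image-requests}}
# /
#
# {+handle-as-image}
# TAG:^IMAGE-REQUEST$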
#
#################################################################################
CLIENT-HEADER-TAGGER: image-requests Tags detected image requests as "IMAGE-REQUEST".
s@^Accept:\s*image/.*@IMAGE-REQUEST@i
#################################################################################
#
# css-requests: Tags detected CSS requests as "CSS-REQUEST". Whether
# or not the detection actually works depends on the browser.
#
#################################################################################
CLIENT-HEADER-TAGGER: css-requests Tags detected CSS requests as "CSS-REQUEST".
s@^Accept:\s*text/css.*@CSS-REQUEST@i
#################################################################################
#
# range-requests: Tags range requests as "RANGE-REQUEST".
#
# By default Privoxy removes Range headers for requests to
# resources that will be filtered to make sure the filters
# get the whole picture. Otherwise Range requests could be
# intentionally used to circumvent filters or, less likely,
# filtering a partial response could damage it because it matched
# a pattern that the resource as a whole wouldn't.
#
# Range requests can be useful and save bandwidth, so instead
# of removing Range headers for requests to resources that
# will be filtered, you may prefer to simply disable filtering
# for those requests.
#
# That's what this tagger is all about. After enabling it,
# you can disable filtering for range requests using the following
# action section:
#
# {-filter -deanimate-gifs}
# TAG:^RANGE-REQUEST
#
#################################################################################
CLIENT-HEADER-TAGGER: range-requests Tags range requests as "RANGE-REQUEST".
s@^Range:.*@RANGE-REQUEST@i
#################################################################################
#
# client-ip-address: Tags the request with the client's IP address.
#
#################################################################################
CLIENT-HEADER-TAGGER: client-ip-address Tags the request with the client's IP address.
s@^\w*\s+.*\s+HTTP/\d\.\d\s*@IP-ADDRESS: $origin@D
#################################################################################
#
# http-method: Tags the request with its HTTP method.
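#
# For example, the request line "GET http://www.example.org/ HTTP/1.1"
# should result in the tag "GET". A sketch of an action section that
# uses such a tag; the block reason is made up:
#
# {+client-header-tagger{http-method}}
# /
#
# {+block{TRACE requests are not wanted here.}}
# TAG:^TRACE$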
#
#################################################################################
CLIENT-HEADER-TAGGER: http-method Tags the request with its HTTP method.
s@^(\w*).*HTTP/\d\.\d\s*$@$1@i
#################################################################################
#
# allow-post: Tags POST requests as "ALLOWED-POST".
#
#################################################################################
CLIENT-HEADER-TAGGER: allow-post Tags POST requests as "ALLOWED-POST".
s@^(?:POST)\s+.*\s+HTTP/\d\.\d\s*@ALLOWED-POST@i
#################################################################################
#
# complete-url: Tags the request with the whole request URL.
#
#################################################################################
CLIENT-HEADER-TAGGER: complete-url Tags the request with the whole request URL.
s@^\w*\s+(.*)\s+HTTP/\d\.\d\s*$@$1@i
#################################################################################
#
# user-agent: Tags the request with the complete User-Agent header.
#
#################################################################################
CLIENT-HEADER-TAGGER: user-agent Tags the request with the complete User-Agent header.
s@^User-Agent:.*@$0@i
#################################################################################
#
# referer: Tags the request with the complete Referer header.
#
#################################################################################
CLIENT-HEADER-TAGGER: referer Tags the request with the complete Referer header.
s@^Referer:.*@$0@i
#################################################################################
#
# content-type: Tags the request with the content type declared by the server.
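#
# For example, "Content-Type: text/html; charset=UTF-8" should result
# in the tag "text/html", as everything from the first semicolon on
# is dropped. A sketch of a section using such tags; that filtering
# should be disabled for images is only an assumption for the example:
#
# {+server-header-tagger{content-type}}
# /
#
# {-filter}
# TAG:^image/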
#
#################################################################################
SERVER-HEADER-TAGGER: content-type Tags the request with the content type declared by the server.
s@^Content-Type:\s*([^;]+).*@$1@i
#################################################################################
#
# privoxy-control: The taggers create tags with the content of X-Privoxy-Control
# headers, the filters remove said headers.
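#
# As a sketch, assuming a client sends the made-up header
# "X-Privoxy-Control: no-filter", the tagger should create the tag
# "no-filter", which a section like the following could act on:
#
# {+client-header-tagger{privoxy-control} +client-header-filter{privoxy-control}}
# /
#
# {-filter}
# TAG:^no-filter$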
#
#################################################################################
CLIENT-HEADER-TAGGER: privoxy-control Creates tags with the content of X-Privoxy-Control headers.
s@^X-Privoxy-Control:\s*@@i
CLIENT-HEADER-FILTER: privoxy-control Removes X-Privoxy-Control headers.
s@^X-Privoxy-Control:.*@@i
SERVER-HEADER-TAGGER: privoxy-control Creates tags with the content of X-Privoxy-Control headers.
s@^X-Privoxy-Control:\s*@@i
SERVER-HEADER-FILTER: privoxy-control Removes X-Privoxy-Control headers.
s@^X-Privoxy-Control:.*@@i