File: check_whitelist

package info
spamassassin 3.1.7-2
  • area: main
  • in suites: etch-m68k
  • size: 5,376 kB
  • ctags: 2,123
  • sloc: perl: 39,706; ansic: 3,133; sh: 1,344; sql: 170; makefile: 168
#!/usr/bin/perl
#
# TODO: should this be made a top-level script, called "sa-awl"?

sub usage {
  die "
usage: check_whitelist [--clean] [--min n] [dbfile]
";
}

use strict;
use Fcntl;
use Getopt::Long;

use vars qw(
		$opt_clean $opt_min $opt_help
	);

GetOptions(
  'clean'		=> \$opt_clean,
  'min:i'		=> \$opt_min,
  'help'		=> \$opt_help
) or usage();
$opt_help and usage();

$opt_min ||= 2;

BEGIN { @AnyDBM_File::ISA = qw(DB_File GDBM_File NDBM_File SDBM_File); }
use AnyDBM_File ;

my $db;
if ($#ARGV == -1) {
  $db = $ENV{HOME}."/.spamassassin/auto-whitelist";
} else {
  $db = $ARGV[0];
}

my %h;
if ($opt_clean) {
  tie %h, "AnyDBM_File",$db, O_RDWR,0600
      or die "Cannot open r/w file $db: $!\n";
} else {
  tie %h, "AnyDBM_File",$db, O_RDONLY,0600
      or die "Cannot open file $db: $!\n";
}

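# Each sender has two keys: "$key" holds the message count and
# "$key|totscore" holds the summed score.  Iterate over the count keys only.
my @k = grep(!/\|totscore$/,keys(%h));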
for my $key (@k)
{
  my $totscore = $h{"$key|totscore"};
  my $count = $h{$key};
  next unless defined($totscore);

  if ($opt_clean) {
    if ($count >= $opt_min) { next; }
    print "cleaning: ";
  }

  # Guard against a zero or missing count so the average cannot divide by zero.
  printf "% 8.1f %15s  --  %s\n",
		  ($count ? $totscore/$count : 0),
		  (sprintf "(%.1f/%d)",$totscore,$count),
		  $key;

  if ($opt_clean) {
    delete $h{"$key|totscore"};
    delete $h{$key};
  }
}
untie %h;

=head1 NAME

check_whitelist - examine and manipulate SpamAssassin's auto-whitelist db

=head1 SYNOPSIS

B<check_whitelist> [--clean] [--min n] [dbfile]

=head1 DESCRIPTION

Check or clean a SpamAssassin auto-whitelist (AWL) database file.

The name of the file is specified after any options, as C<dbfile>.
The default is C<$HOME/.spamassassin/auto-whitelist>.

=head1 OPTIONS

=over 4

=item --clean

Clean out infrequently-used AWL entries.  The C<--min> switch can be
used to select the threshold at which entries are kept or deleted.

=item --min n

Select the threshold at which entries are kept or deleted when C<--clean> is
used.  The default is C<2>, so entries that have only been seen once are
deleted.

=back
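
As a rough illustration of what C<--clean> and C<--min> do, the sketch below
applies the same keep-or-delete test to an in-memory hash instead of a tied
database file.  The sample data and the helper name C<clean_awl> are made up
for this example; only the key layout (a count under C<address|ip=base> and a
summed score under the matching C<|totscore> key) mirrors what the script
reads.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SpamAssassin): delete entries that were
# seen fewer than $min times.  %$awl uses the layout the script reads:
# "$key" holds the count, "$key|totscore" holds the summed score.
sub clean_awl {
  my ($awl, $min) = @_;
  my @removed;
  for my $key (sort grep { !/\|totscore$/ } keys %$awl) {
    next unless defined $awl->{"$key|totscore"};
    next if $awl->{$key} >= $min;       # seen often enough: keep
    delete $awl->{"$key|totscore"};
    delete $awl->{$key};
    push @removed, $key;
  }
  return @removed;
}

# Made-up sample data, not taken from a real database.
my %awl = (
  'alice@example.com|ip=192.0'          => 5,
  'alice@example.com|ip=192.0|totscore' => 1.5,
  'bob@example.com|ip=198.51'           => 1,
  'bob@example.com|ip=198.51|totscore'  => 12.3,
);

my @removed = clean_awl(\%awl, 2);
print "removed: $_\n" for @removed;
```

With a threshold of C<2>, only the entry seen once is dropped; raising the
threshold makes the cleanup more aggressive.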

=head1 OUTPUT

The output looks like this:

     AVG  (TOTSCORE/COUNT)  --  EMAIL|ip=IPBASE

For example:

     0.0         (0.0/7)  --  dawson@example.com|ip=208.192
    21.8        (43.7/2)  --  mcdaniel_2s2000@example.com|ip=200.106

C<AVG> is the average score, that is, C<TOTSCORE> divided by C<COUNT>;
C<TOTSCORE> is the total score of all messages seen so far from that sender;
C<COUNT> is the number of messages seen from that sender;  C<EMAIL> is the
sender's email address, and C<IPBASE> is the B<AWL base IP address>.
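
If the listing needs to be post-processed, each line can be split back into
its fields.  The following is a minimal sketch, assuming the exact C<printf>
layout used by this script; C<parse_awl_line> is a hypothetical helper, not
part of SpamAssassin.

```perl
use strict;
use warnings;

# Hypothetical helper: split one line of check_whitelist output into its
# fields.  Returns nothing if the line does not match the expected layout.
sub parse_awl_line {
  my ($line) = @_;
  return unless $line =~ m{
      ^\s*(?:cleaning:\s+)?     # optional prefix printed with --clean
      (-?[\d.]+)\s+             # AVG
      \((-?[\d.]+)/(\d+)\)      # TOTSCORE and COUNT, in parentheses
      \s+--\s+
      (.*)\|ip=(\S+)\s*\z       # EMAIL and IPBASE
  }x;
  return {
    avg      => $1,
    totscore => $2,
    count    => $3,
    email    => $4,
    ipbase   => $5,
  };
}

my $entry = parse_awl_line(
  '    21.8        (43.7/2)  --  mcdaniel_2s2000@example.com|ip=200.106');
printf "%s sent %d messages, average score %.1f\n",
    $entry->{email}, $entry->{count}, $entry->{avg};
```

The pattern also accepts the C<cleaning: > prefix, so the same helper can read
the output of a C<--clean> run.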

The B<AWL base IP address> is an approximate identifier for the network a
sender usually sends from.  It is coarse enough to tolerate dynamic addresses
but remains hard for spammers to spoof.  The algorithm is as follows:

  - take the last Received header that contains a public IP address -- namely
    one which is not in private, unrouted IP space.

  - chop off the last two octets, assuming that the user may be in an ISP's
    dynamic address pool.
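
The two steps above can be sketched as follows.  This is only an illustration,
assuming the IPv4 addresses have already been extracted from the Received
headers and are passed in the order in which the headers are considered, so
that "last" means the final element of the list.  The names C<is_private_ip>
and C<awl_base_ip> are invented here, and the private-range check covers only
a few well-known blocks, which may be narrower than what SpamAssassin itself
treats as private.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the IPv4 address lies in private or otherwise
# non-routable space.  Only a few well-known blocks are listed.
sub is_private_ip {
  my ($ip) = @_;
  return $ip =~ /^(?:10|127)\./                    # 10/8 and loopback
      || $ip =~ /^192\.168\./                      # 192.168/16
      || $ip =~ /^169\.254\./                      # link-local
      || $ip =~ /^172\.(?:1[6-9]|2\d|3[01])\./     # 172.16/12
      ? 1 : 0;
}

# Hypothetical helper: pick the last public address in the list and keep
# only its first two octets.  Returns nothing if no public address exists.
sub awl_base_ip {
  my @ips = @_;
  for my $ip (reverse @ips) {
    next if is_private_ip($ip);
    next unless $ip =~ /^(\d{1,3}\.\d{1,3})\.\d{1,3}\.\d{1,3}$/;
    return $1;
  }
  return;
}

my $base = awl_base_ip('208.192.10.20', '192.168.1.5');
print "ip=$base\n";
```

Here the private C<192.168.1.5> is skipped and the remaining address is cut
down to its first two octets, the same form as the C<ip=> field in the
listing.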

=cut