gregoa's blog: dynamic blacklisting with exim & postgresql

Motivation

If you run a mail server, you surely know the more or less creative attempts of spammers to deliver to nonexistent email addresses: guessing local parts, mangling harvested local parts (deliberately or through broken scripts), or just sending to random addresses. Some examples from yesterday:

2busenet-0402@comodo.priv.at (was gregor+usenet ...)
87.29.61.85@comodo.priv.at
a48ff091@comodo.priv.at
gregor.herrmannnn@comodo.priv.at (so many n's ...)
thanksgiving@comodo.priv.at

At some point we (as in "the spam & mail departments of the CUG", IOW: Bernd & me) decided that we actually don't want to accept mail from machines (mostly trojaned clients in botnets) which send to such random addresses (or to explicit spamtraps).
Thesis: a machine that sends mail to this kind of address is not a legitimate MTA, probably has a very bad ham/spam ratio, & is better blocked at the RCPT TO stage in the first place.

Concept

After some attempts with scripts that ran daily against the maillog & created a plain-text blacklist for exim, we came up with a solution that

is dynamic
works in realtime
only uses exim & postgresql

Postgresql

Add a user exim.
Add a database exim.

Our database looks like this:


CREATE TABLE cugrbl_host (
    ip inet NOT NULL
);

CREATE TABLE cugrbl_incident (
    id integer NOT NULL,
    ip inet NOT NULL,
    recipient character varying(128) NOT NULL,
    sender character varying(128) NOT NULL,
    datetime timestamp without time zone DEFAULT now() NOT NULL
);

CREATE SEQUENCE cugrbl_indicent_id_seq
    INCREMENT BY 1
    NO MAXVALUE
    NO MINVALUE
    CACHE 1;

CREATE TABLE cugrbl_pattern (
    id integer NOT NULL,
    condition character varying(256) NOT NULL
);

CREATE SEQUENCE cugrbl_pattern_id_seq
    INCREMENT BY 1
    NO MAXVALUE
    NO MINVALUE
    CACHE 1;

ALTER TABLE cugrbl_incident ALTER COLUMN id SET DEFAULT nextval('cugrbl_indicent_id_seq'::regclass);
ALTER TABLE cugrbl_pattern ALTER COLUMN id SET DEFAULT nextval('cugrbl_pattern_id_seq'::regclass);

ALTER TABLE ONLY cugrbl_host
    ADD CONSTRAINT cugrbl_host_pkey PRIMARY KEY (ip);

ALTER TABLE ONLY cugrbl_incident
    ADD CONSTRAINT cugrbl_indicent_pkey PRIMARY KEY (id);

ALTER TABLE ONLY cugrbl_pattern
    ADD CONSTRAINT cugrbl_pattern_pattern_key UNIQUE (condition);

ALTER TABLE ONLY cugrbl_pattern
    ADD CONSTRAINT cugrbl_pattern_pkey PRIMARY KEY (id);

CREATE INDEX cugrbl_incident_datetime_index ON cugrbl_incident USING btree (datetime);

You may want to tweak the length of cugrbl_incident.{recipient,sender}, either by looking up the maximum lengths of email addresses in RFC 2821 (64 characters for the local part, 255 for the domain) or by going through your own log files.

cugrbl_pattern.condition contains email addresses like the ones mentioned above & has to be filled manually. The field is later used with the LIKE operator in a WHERE clause & may therefore contain the % wildcard character (keep in mind that _ is a LIKE wildcard too & matches any single character). For those who don't like either psql or phppgadmin, the following shell script makes inserting more convenient:

#!/bin/sh

for a in "$@"; do 
        echo "trying $a ..."
        echo "insert into cugrbl_pattern (condition) VALUES ('$a');" | \
        psql -h localhost -U exim exim
done
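
The script above pastes each argument straight into the SQL string, so a pattern containing a single quote breaks the statement. A minimal sketch of a more defensive variant could look like this; sql_quote and pattern_insert_sql are hypothetical helper names, & the sketch assumes that standard SQL quote doubling is all the escaping your patterns need (backslashes are not handled):

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original setup.

# Double every single quote so the value is safe inside '...' in SQL.
sql_quote() {
    printf '%s\n' "$1" | sed "s/'/''/g"
}

# Print (not run) the INSERT statement for one pattern.
pattern_insert_sql() {
    printf "INSERT INTO cugrbl_pattern (condition) VALUES ('%s');\n" "$(sql_quote "$1")"
}

# Usage, with the same connection parameters as the script above:
#   for a in "$@"; do
#       pattern_insert_sql "$a" | psql -h localhost -U exim exim
#   done
```

Printing the statement first & piping it to psql in a second step also makes it easy to check what would be inserted before touching the database.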

Exim

Exim can "talk" directly to the postgresql database, & in our setup it:

checks if the recipient address matches an entry in cugrbl_pattern.condition
if yes, writes the IP to cugrbl_host.ip (unless it is already there) & logs to cugrbl_incident
checks if the IP is in cugrbl_host.ip
if yes, writes a header & optionally rejects the mail
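
Read together with the ACL stanzas further down, these steps amount to a small decision table. The following sketch only illustrates that flow; cugrbl_actions & the action names it prints are made up for this example, & the extra headers written by the two spamtrap stanzas are left out. Note that both lookups run before anything is inserted, so an IP is only warned about or rejected from the next RCPT TO after the one that got it listed.

```shell
#!/bin/sh
# Sketch of the ACL decision flow; cugrbl_actions is a hypothetical helper.
# Arguments (each 0 or 1):
#   $1  recipient matches a pattern in cugrbl_pattern   (acl_m1)
#   $2  sending IP is already in cugrbl_host            (acl_m2)
#   $3  recipient domain is listed in deny_cugrbl
#   $4  envelope sender is empty (bounce)
cugrbl_actions() {
    actions=""
    if [ "$1" = 1 ]; then
        # spamtrap hit: remember the host if it is new, always log the incident
        if [ "$2" = 0 ]; then
            actions="insert_host"
        fi
        actions="$actions insert_incident"
    fi
    if [ "$2" = 1 ]; then
        # the IP was listed before this RCPT TO: add the X-Warning header
        actions="$actions add_warning_header"
        # reject only for opted-in domains & never for bounces
        if [ "$3" = 1 ] && [ "$4" = 0 ]; then
            actions="$actions reject"
        fi
    fi
    actions=${actions# }
    echo "${actions:-none}"
}
```

For example, cugrbl_actions 1 0 1 0 (first spamtrap hit from an unknown IP) prints "insert_host insert_incident", while cugrbl_actions 0 1 1 0 (a listed IP writing to a normal address in a rejecting domain) prints "add_warning_header reject".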

First we set up the connection & define a few macros:

/etc/exim4/conf.d/main/10_exim4-config_cugrbl

# define our database connection
# local database accessed via a UNIX socket
hide pgsql_servers = (/var/run/postgresql/.s.PGSQL.5432)/exim/exim/PASSWORD
# local or remote database accessed via the network
# hide pgsql_servers = localhost/exim/exim/PASSWORD

SELECT_CONDITION = SELECT COUNT(condition) FROM cugrbl_pattern WHERE '${quote_pgsql:$local_part@$domain}' LIKE condition
SELECT_IP = SELECT COUNT(ip) FROM cugrbl_host WHERE ip='${quote_pgsql:$sender_host_address}'

INSERT_HOST = INSERT INTO cugrbl_host(ip) VALUES ('${quote_pgsql:$sender_host_address}')
INSERT_INCIDENT = INSERT INTO cugrbl_incident (ip, sender, recipient) VALUES ('${quote_pgsql:$sender_host_address}','${quote_pgsql:$sender_address}','${quote_pgsql:$local_part@$domain}')

Then comes the actual work of checking/writing/warning/rejecting:

/etc/exim4/conf.d/acl/30_exim4-config_check_rcpt

[..]

    # get variables
    warn set acl_m1 = ${lookup pgsql{SELECT_CONDITION}{$value}{2}}
    warn set acl_m2 = ${lookup pgsql{SELECT_IP}{$value}{2}}

    # cond & ip
    warn message = SELECT-CONDITION & SELECT-IP - Spamtrap-Entry: IP ($sender_host_address) already known, SMTP-Tokens added, see https://info.colgarra.priv.at/cugrbl/
       condition = ${if eq{$acl_m1}{1}}
       condition = ${if eq{$acl_m2}{1}}
       condition = ${lookup pgsql{INSERT_INCIDENT}{yes}{no}}

    # cond & ! ip
    warn message = SELECT-CONDITION & ! SELECT-IP - Spamtrap-Entry: IP ($sender_host_address) added to cUGrbl and SMTP-Tokens added, see https://info.colgarra.priv.at/cugrbl/
       condition = ${if eq{$acl_m1}{1}}
       condition = ${if eq{$acl_m2}{0}}
       condition = ${lookup pgsql{INSERT_HOST}{yes}{no}}
       condition = ${lookup pgsql{INSERT_INCIDENT}{yes}{no}}

    # check ip - warn
    warn message = X-Warning: $sender_host_address listed at CUGRBL (spamtrap), see https://info.colgarra.priv.at/cugrbl/
     log_message = $sender_host_address listed at CUGRBL (spamtrap), see https://info.colgarra.priv.at/cugrbl/
       condition = ${if eq{$acl_m2}{1}}

    # check ip - reject
    deny message = We don't accept mail from $sender_host_address, which is locally blacklisted due to a spamtrap. For further information contact postmaster@$domain or visit https://info.colgarra.priv.at/cugrbl/
       condition = ${if eq{$acl_m2}{1}}
        !senders = :
        domains  = CONFDIR/domains/deny_cugrbl

[..]

Please note that in this setup mail is only rejected for recipient domains that are listed in /etc/exim4/domains/deny_cugrbl, & never when the envelope sender is empty (bounces). Mail to other domains just gets an X-Warning header.
Before this stanza, mail has already been accepted if it comes from trusted machines, from IPs in dnswl.org, or from authenticated senders, or if it goes to postmaster, ...

Cleaning up

We don't want to fill the cugrbl_host table ad infinitum, & we want to give sending machines the chance to get off the blacklist. The following Perl script removes entries from this table after one month without an incident:

/etc/cron.daily/removeoldspamtrapips

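#!/usr/bin/perl

use strict;
use warnings;
use DBI;

# RaiseError makes the cron job die with the database error instead of failing silently
my $dbh = DBI->connect("dbi:Pg:dbname=exim", "exim", "PASSWORD", { RaiseError => 1 });
# drop hosts that have had no incident during the last month
$dbh->do("DELETE FROM cugrbl_host WHERE ip NOT IN (SELECT DISTINCT ip FROM cugrbl_incident WHERE datetime > now() - interval '1 month')");
$dbh->disconnect;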
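
Before letting the cron job delete anything, it can be reassuring to see which hosts it would remove. Here is a small sketch that prints the matching SELECT for a given retention period; cleanup_preview_sql is a hypothetical helper, & the interval is pasted into the statement unescaped, so only pass values you trust:

```shell
#!/bin/sh
# Hypothetical helper: print a SELECT listing the hosts the cleanup would delete.
#   $1 = PostgreSQL interval without incidents (defaults to "1 month")
cleanup_preview_sql() {
    interval=${1:-"1 month"}
    printf "SELECT ip FROM cugrbl_host WHERE ip NOT IN (SELECT DISTINCT ip FROM cugrbl_incident WHERE datetime > now() - interval '%s');\n" "$interval"
}

# Usage:
#   cleanup_preview_sql "2 weeks" | psql -h localhost -U exim exim
```

The WHERE clause is the same one the DELETE uses, so the preview & the cleanup select the same rows as long as both run with the same interval.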

Experiences

It works :)
At the moment there are 34.398 unique IPs in cugrbl_host.
cugrbl_incident has 1.147.976 entries (since 2007-07-29).
{percentage of rejects ...}

gregoa, 2007-12-25, 2008-01-09, 2008-03-06
Thanks to Bernd for his feedback on this documentation.

All material on this blog — unless stated otherwise — is © gregor herrmann, and is licensed under the Creative Commons Attribution-Share Alike 3.0 Austria License.