From: | Greg Smith <gsmith(at)gregsmith(dot)com> |
---|---|
To: | "Marc G(dot) Fournier" <scrappy(at)hub(dot)org> |
Cc: | "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL WWW <pgsql-www(at)postgresql(dot)org>, sysadmins <sysadmins(at)postgresql(dot)org> |
Subject: | Re: List moderation - need a break! |
Date: | 2009-06-15 23:08:15 |
Message-ID: | alpine.GSO.2.01.0906151830450.21975@westnet.com |
Lists: | pgsql-www |
On Mon, 15 Jun 2009, Marc G. Fournier wrote:
> I've tried unsuccessfully in the past to do language based regex's and failed
> miserably though ... anyone out there good at this? ;
I thought Josh was suggesting leaning on the SpamAssassin toolset here;
you're certainly not going to write this yourself in any reasonable amount
of time. The two rules you can use are:
CHARSET_FARAWAY Character set indicates a foreign language
UNWANTED_LANGUAGE_BODY Message written in an undesired language
which both default to a relatively high score (around +3 points on the rev
I just checked). The languages you're willing to accept go into
ok_languages, which defaults to "all";
http://email.about.com/cs/spamassassintips/qt/et032504.htm has a reasonable
primer on it.
Since tripping either rule alone isn't enough to pass a typical threshold,
legit messages from people who just happen to have foreign-language text in
their signature and such should typically survive. You might start by setting
ok_languages and reducing the point value for the rules to something small
in order to judge their impact, before using the higher default score.
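
A rough sketch of what that trial setup might look like in local.cf. The
language list ("en") and the 0.5 scores are placeholders I picked for
illustration, not recommendations. As far as I know, ok_languages comes from
the TextCat plugin, which has to be loaded for UNWANTED_LANGUAGE_BODY to fire,
and CHARSET_FARAWAY is driven by the separate ok_locales setting, so I've
shown both:

```
# Assumption: the TextCat plugin is enabled elsewhere (e.g. in a .pre file):
#   loadplugin Mail::SpamAssassin::Plugin::TextCat

# Languages / character-set locales to accept (placeholder: English only)
ok_languages en
ok_locales   en

# Trial scores, deliberately small so the rules only nudge the total
score UNWANTED_LANGUAGE_BODY 0.5
score CHARSET_FARAWAY        0.5
```

Once you've watched which messages trip the rules for a while, deleting the
two score lines puts the rules back on their default scores.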
--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD