[Home]Spam/Filtering

ec2-3-135-183-89.us-east-2.compute.amazonaws.com | ToothyWiki | Spam | RecentChanges | Login | Webcomic

On a related note, why obfuscate drug names? Surely if someone is explicitly filtering the words "Viagra" or "Xanax", one might take it as a slight hint that they don't want any?
(PeterTaylor) Shame that filtering 1337 would probably cause false positives matching e-mails containing code with variable names including numerals.
One thing I noticed with such spams is that they invariably mention these drugs multiple times, and use a different mis-spelling each time. Therefore, if one builds a reasonably comprehensive list of mis-spellings to watch for, there is a really good chance that such a spam will trigger that rule. --Admiral
*Surely* building a regexp is a better idea than a big list of misspellings? Something like (case insensitive) (V|\\/)+[[:punct:]]*[I1l!|]+[[:punct:]]*[A@]+[[:punct:]]*[G69]+[[:punct:]]*R+[[:punct:]]*[A@]+ ... --AlexChurchill
Ought to replace each [[:punct:]]* with [[:punct:][:space:]]* in fact, so that spaces between the letters are skipped as well as punctuation.  And one wonders whether Cialis was so named specifically to provide a huge range of filter-dodging misspelling opportunities! --AlexChurchill
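For anyone who wants to try the regexp idea outside the mail server, here is a minimal Python sketch. Python's re module has no POSIX classes, so string.punctuation plus \s stands in for "punctuation or whitespace" (ASCII only, which is an assumption), and the function name looks_like_viagra is made up for illustration. It only covers the six letters given above, not whatever the "..." was meant to continue with.

```python
import re
import string

# Separator between letters: any run of ASCII punctuation or whitespace.
# This approximates the POSIX "punct or space" class, which Python's re lacks.
SEP = "[" + re.escape(string.punctuation) + r"\s]*"

# One pattern per letter of "viagra", following the substitutions suggested above.
LETTERS = [
    r"(?:v|\\/)+",  # a V, or a backslash followed by a slash drawn as a V
    r"[i1l!|]+",
    r"[a@]+",
    r"[g69]+",
    r"r+",
    r"[a@]+",
]

OBFUSCATED_VIAGRA = re.compile(SEP.join(LETTERS), re.IGNORECASE)


def looks_like_viagra(text: str) -> bool:
    """Return True if text contains a spelling the pattern recognises."""
    return OBFUSCATED_VIAGRA.search(text) is not None
```

Because the separator class overlaps with some of the letter substitutes (!, | and @), the engine has to backtrack a little, which is harmless on subject-line-sized input but worth knowing about before running it over whole message bodies.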

The ToothyWikiInternals/MailServer global spam filter supports regexps and now gives email with more than three of the same vowel in a row in the subject line a spam score of +3. This should be good enough to catch this kind of mail - tell me if it breaks anything. In general, like I say on that page, if there are rules you want added to it, or rules you're using privately that are particularly useful, please do feel free to mention them to me so that I can add them to the global filter and they can benefit everyone using toothycat.net mail ;) - MoonShadow
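As a rough illustration of the repeated-vowel rule (not the actual toothycat.net filter configuration, which isn't shown on this page), something like the following Python would do it. It assumes "more than three" means four or more identical vowels in a row, ignoring case; the names REPEATED_VOWEL and repeated_vowel_score are invented here.

```python
import re

# A vowel followed by at least three more copies of itself: four or more in a row.
REPEATED_VOWEL = re.compile(r"([aeiou])\1{3,}")

# Score added when the rule fires, as stated above.
REPEATED_VOWEL_SCORE = 3


def repeated_vowel_score(subject: str) -> int:
    """Return 3 if the subject has four or more of the same vowel in a row, else 0."""
    # Lower-casing first means the backreference does not depend on case handling.
    if REPEATED_VOWEL.search(subject.lower()):
        return REPEATED_VOWEL_SCORE
    return 0
```

Under this reading "Viaaaagra" scores and "Viaaagra" does not; if the real rule counts three in a row, the {3,} would become {2,}.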
Thanks. I hadn't thought of asking to have a rule added to the ToothyCat.net filters. I've created a rule to watch for that header, and will see how it goes.
Uh, careful with just watching for the header - a lot of my legitimate mail gets a single 'x', for instance; you may want to do something like "X-SpamScore contains xx" or "X-SpamScore contains xxx". - MoonShadow
And I'm afraid the rules I get most use out of are "anything that's HTML or multipart and sender not in my addressbook" (specific to my address book), and "anything sent to any one of a number of cantab.net email addresses that my address seems to get bracketed with due to lexicographic proximity" (spams get sent to groups of 4-20 adjacent addresses on the spammers' lists), so both are very unlikely to be useful for anyone else. But if I notice any of my other rules catching many, I'll try to remember to let you know. --AlexChurchill
Fair enough - much the same here, actually ^^ - MoonShadow
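For anyone scripting the header check rather than using a mail client rule, here is a small Python sketch of the "contains xx" test. It assumes the score header is called X-SpamScore and that its value is a run of 'x' characters, one per point, which is only inferred from the comments above; spam_score_at_least is a made-up name.

```python
from email import message_from_string


def spam_score_at_least(raw_message: str, threshold: int = 2) -> bool:
    """Return True if the X-SpamScore header contains at least `threshold` x's in a row."""
    msg = message_from_string(raw_message)
    # A missing header is treated as an empty score.
    header = msg.get("X-SpamScore", "")
    return "x" * threshold in header
```

The default threshold of 2 mirrors the "contains xx" suggestion; passing 3 gives the stricter "contains xxx" version.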

Further to this: two of my other rules have been consistently catching lots of spam to me (several a day, almost every day), so as requested here are their details.
Anything with any of these words in the subject gets filtered off to my "almost certainly spam" folder, which I review occasionally to check for false positives:
Vicodin; Viagra; $200 bonus; prescri; citrate; pharmac; xanax; medication; RND_UC_CHAR; Alex.churchill; Photos of Singles; GOT PILLS; V1agra
As does anything matching simultaneously one of these strings in the subject:
@cantab.net; Free; $; Health; PC; Mortgage; Interest rate; Sale; You've Won; Get Paid; No Cost; No Shipping; Percent; Penis; Male; education; Visa; loan; lender; ADV; cash; debt; refinance; bills; sale; money; hgh; Real Estate; Deals; surveys; opinion; winner; prescript
and one of these strings in the message body:
Gift-Offers.com; unsub; opted; Jazzy Deals; %RANDOM_TEXT; opt-in; bflv; online surveys; our web site; click here; Viagra; penis; BuyersAdvantage; pharmacy; life-insurance
A number of those subjects could probably be moved off to the first rule to trigger a filter all on their own; this is just the current state of my filter rules. --AlexChurchill
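The two rules above boil down to "a strong subject match on its own, or a weak subject match together with a weak body match". A minimal Python sketch of that logic follows. The lists are deliberately short subsets of the ones given above, matching is by case-insensitive substring (an assumption, since the mail client's matching rules aren't stated), and all the names are invented for illustration.

```python
# Subject strings that trigger the filter on their own (subset of the first list).
SUBJECT_ALONE = ["vicodin", "viagra", "$200 bonus", "prescri", "xanax", "v1agra"]

# Subject strings that only count when the body also matches (subset).
SUBJECT_WEAK = ["free", "$", "mortgage", "you've won", "loan", "refinance"]

# Body strings that complete a weak subject match (subset).
BODY_WEAK = ["unsub", "opted", "opt-in", "click here", "pharmacy"]


def contains_any(text: str, needles: list[str]) -> bool:
    """Case-insensitive substring test against a list of strings."""
    lowered = text.lower()
    return any(needle.lower() in lowered for needle in needles)


def is_probable_spam(subject: str, body: str) -> bool:
    """Apply the two rules: strong subject alone, or weak subject plus weak body."""
    if contains_any(subject, SUBJECT_ALONE):
        return True
    return contains_any(subject, SUBJECT_WEAK) and contains_any(body, BODY_WEAK)
```

Moving a string from the weak subject list to the strong one, as suggested above, is then just a matter of moving it between the two Python lists.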

I'd just like to poke my nose in and say that I've found Thunderbird's automatic junk filtering to be pretty good - though it does take a long time (several weeks, which is about 3k spam mails) to train.  Even so, I'm practically moving to a whitelist system anyway.  One thing I would like is a rule that says "I have been sent this exact same subject line from multiple different people - junk it", since that is the most annoying type of spam now: if one of them gets through my filters, all of them do.  --Vitenka
(PeterTaylor) Presumably exempting subjects starting "Re: "?
Nah, any threaded conversations will be whitelisted already.  But then, I barely use email now, I just need it to filter away everything and then flag up 'you need to sift through that for a good one' from time to time.  --Vitenka
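The "same subject from several different people" rule is easy enough to sketch outside a mail client. The Python below is only an illustration under some assumptions: messages arrive as (sender, subject) pairs, "multiple" is taken to mean three or more distinct senders (min_senders is adjustable), subjects are compared exactly as wished for above, and whitelisted senders are skipped so that threaded conversations are left alone. None of the names come from Thunderbird.

```python
from collections import defaultdict
from typing import Iterable


def subjects_to_junk(
    messages: Iterable[tuple[str, str]],
    min_senders: int = 3,
    whitelist: frozenset[str] = frozenset(),
) -> set[str]:
    """Return subjects seen from at least `min_senders` distinct non-whitelisted senders.

    `whitelist` is expected to hold lower-case addresses.
    """
    senders_by_subject: dict[str, set[str]] = defaultdict(set)
    for sender, subject in messages:
        address = sender.lower()
        if address in whitelist:
            continue  # whitelisted mail never counts towards a junk subject
        senders_by_subject[subject].add(address)
    return {
        subject
        for subject, senders in senders_by_subject.items()
        if len(senders) >= min_senders
    }
```

Repeated mail from one sender does not count more than once, so a single chatty correspondent cannot get their own subject line junked.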

Last edited August 2, 2004 12:33 pm (viewing revision 9, which is the newest) (diff)