ifile-discuss
[Ifile-discuss] Improving classification of spams


From: Jack Bertram
Subject: [Ifile-discuss] Improving classification of spams
Date: Fri, 10 Jan 2003 16:08:24 +0000
User-agent: Mutt/1.4i

Hi all

I use ifile to filter into about 30 different folders and it does a very
good job on nearly all mail.  However, it does a much worse job of
correctly recognising spam email as spam.  Now, I'm much happier with
false negatives than false positives, so this isn't too much of a
problem, but it does lead me to wonder why spam in particular is so
hard to classify.

My hypothesis is simple: my other folders are fairly homogeneous, since
they correspond to particular mailing lists, mail from particular people
who tend to talk about similar things, etc.  But spam email falls into a
number of different categories (Nigerian spam, porn, etc.), yet I put it
all in one folder.  Since ifile essentially computes an "average" for each
folder, and compares an incoming email to that average, heterogeneous
folders are harder to match correctly than homogeneous ones.
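
To make the "average per folder" idea concrete, here is a minimal sketch
of a word-count (naive Bayes style) classifier.  It is only an
illustration, not ifile's actual code: the function names, the
whitespace tokenisation, the add-one smoothing and the toy folder
contents are all assumptions made for this example.

```python
import math
from collections import Counter

def train(folders):
    """Build per-folder word statistics (the folder "average").

    folders: dict mapping a folder name to a list of message strings.
    Returns (model, vocab); both structures are hypothetical, not ifile's.
    """
    model = {}
    vocab = set()
    total_msgs = sum(len(msgs) for msgs in folders.values())
    for name, msgs in folders.items():
        counts = Counter(w for msg in msgs for w in msg.lower().split())
        vocab.update(counts)
        # Store word counts, total token count, and the folder's prior.
        model[name] = (counts, sum(counts.values()), len(msgs) / total_msgs)
    return model, vocab

def score(model, vocab, message):
    """Log-probability of the message under each folder's word statistics."""
    words = message.lower().split()
    scores = {}
    for name, (counts, total, prior) in model.items():
        s = math.log(prior)
        for w in words:
            # Add-one smoothing so unseen words do not give log(0).
            s += math.log((counts[w] + 1) / (total + len(vocab)))
        scores[name] = s
    return scores

def classify(model, vocab, message):
    """Pick the folder whose statistics best match the message."""
    scores = score(model, vocab, message)
    return max(scores, key=scores.get)

# Toy data, invented for illustration only.
folders = {
    "list": ["patch for the build script", "build fails on the new release"],
    "spam": ["urgent transfer of funds", "hot pics click now"],
}
model, vocab = train(folders)
print(classify(model, vocab, "urgent funds transfer"))
print(classify(model, vocab, "build script patch"))
```

In this sketch the single "spam" folder pools the word counts of two
unrelated kinds of message, so each kind contributes only part of the
folder's statistics.  Training with separate folders (say "spam-scam"
and "spam-porn", both hypothetical names) and merging them again after
classification would be one way to test the hypothesis.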

So, I'm asking two questions:

1. Is this hypothesis any good - does anyone else have the same
experience as me, with non-spam categorised correctly but spam not
recognised so well?

2. How many different sorts of spam do I have to distinguish in order to
make spam matching work better?  Will a porn/non-porn distinction work
well, or do I need to use more spam categories in order to get good
matching?  What do other people on this list do?

Cheers,
jack



