[Ifile-discuss] Re: html tag stripping


From: clemens fischer
Subject: [Ifile-discuss] Re: html tag stripping
Date: 25 Jun 2003 22:48:17 +0200
User-agent: Gnus/5.1003 (Gnus v5.10.3) Emacs/21.3 (berkeley-unix)

* David Bushong:

> Yo<kc34sma21py2>uve rea<khuyowp1wuizl>d about them in the 
> P<ks4nj3w258mkq1>apers....
>
> (If you're reading this list in HTML, try turning it off).  Basically, this
> completely ruins ifile's effectiveness.  However a simple addition to the
> word tokenizer to skip anything between matched <>'s would completely avoid
> this problem (as well as stop making "font", "color", etc. my most popular
> words, spam or otherwise).
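
a minimal sketch of the tokenizer change described above, in Python for illustration only.  ifile's real tokenizer is written in C and is not shown in this thread, so the names `TAG_RE`, `WORD_RE`, `strip_tags` and `tokenize` are hypothetical, and the word pattern is an assumption rather than ifile's actual rule.

```python
import re

# Hypothetical names for illustration; not taken from ifile's source.
# Matches a '<', any run of characters that are not angle brackets, then '>'.
TAG_RE = re.compile(r"<[^<>]*>")
# Assumed word pattern: runs of letters and apostrophes.
WORD_RE = re.compile(r"[A-Za-z']+")


def strip_tags(text):
    """Remove anything between a matched '<' and '>'.

    The tag is replaced with nothing (not a space), so a word that was
    split by a bogus tag is joined back together.
    """
    return TAG_RE.sub("", text)


def tokenize(text):
    """Return lowercased word tokens after tag stripping."""
    return [word.lower() for word in WORD_RE.findall(strip_tags(text))]


sample = "Yo<kc34sma21py2>uve rea<khuyowp1wuizl>d about them in the P<ks4nj3w258mkq1>apers"
print(tokenize(sample))
```

an unmatched "<" is left alone, but in plain text a stray "<" followed later by a ">" (even on another line) would be swallowed together with everything in between, which is one reason a change like this might be better as an option than as the default.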

you've got my vote, because it's simple.  then again, people who use
ifile for something other than spam filtering may not like it.  as far
as i've seen, nothing in ifile development has ever diminished its
applicability to text messages, be they email or usenet, though there
have been many attempts to let it un-base64 MIME parts or whatnot.  up
to now, none of that has happened.

have you thought about testing bogofilter?

  clemens
