
Re: [leafnode-list] New or enhanced filtering in leafnode-2.0?



Lloyd Zusman wrote:

[...]

There is a new filtering mechanism in the current alpha versions of
leafnode-2.0. It can work with your examples:

>     Example:   in the newsgroup  `news.software.readers', I
>                *only* want to download articles that have
>                the string "slrn" (upper or lower case) in the
>                Subject: line.

newsgroup = news.software.readers
pattern = ^Subject:.*[sS][lL][rR][nN]
action = select

>     Example:   in all `alt.binaries.*' newsgroups, *only* download
>                articles which have the following pattern in
>                their Subject: lines:     \.(jpe?g|gif)
>                [upper or lower case]
> 
>                in all other newsgroups, *never* download articles
>                which have the following pattern in their Subject:
>                lines:   \.(jpe?g|gif)  [upper or lower case]

newsgroup = alt.binaries.*
pattern = ^Subject:.*\.[jJ][pP][eE]?[gG]
action = select
pattern = ^Subject:.*\.[gG][iI][fF]
action = select

newsgroup = *
pattern = ^Subject:.*\.[jJ][pP][eE]?[gG]
action = kill
pattern = ^Subject:.*\.[gG][iI][fF]
action = kill

(This works because patterns are evaluated from top to bottom, and a
"select" stops evaluation before the kill rules are reached.)

You can also score articles with, e.g., "action = 5" or "action = -10".
Note that "action = 999" is still different from "action = select":
only in the former case are subsequent regexps still checked.
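
As a rough illustration of the evaluation order described above, here is a
minimal Python sketch. It is not leafnode's code. The function name
`evaluate`, the rule tuples, the shell-style matching of newsgroup wildcards,
and the idea that numeric actions are summed into one score are all
assumptions made for the example; only the top-to-bottom order and the
"select/kill stop, numbers continue" behaviour come from the text.

```python
import fnmatch
import re


def evaluate(rules, newsgroup, header_line):
    """Return "select", "kill", or the accumulated integer score.

    rules is a list of (newsgroup_glob, regexp, action) tuples, checked
    in order. This models the behaviour described in the mail, not the
    actual leafnode implementation.
    """
    score = 0
    for group_glob, pattern, action in rules:
        # Assumption: newsgroup wildcards behave like shell globs.
        if not fnmatch.fnmatchcase(newsgroup, group_glob):
            continue
        if not re.search(pattern, header_line):
            continue
        if action in ("select", "kill"):
            # Terminal action: later rules are not consulted.
            return action
        # Numeric action: adjust the score and keep checking
        # (summing the scores is an assumption).
        score += int(action)
    return score


# Hypothetical rule set: first rule scores very high, second rule kills.
RULES = [
    ("*", r"^Subject:.*[sS][lL][rR][nN]", "999"),
    ("*", r"^Subject:.*\.[gG][iI][fF]", "kill"),
]

# With "999" the kill rule is still reached; with "select" it is not.
SELECT_RULES = [("*", r"^Subject:.*[sS][lL][rR][nN]", "select")] + RULES[1:]

print(evaluate(RULES, "news.software.readers", "Subject: slrn logo.gif"))
print(evaluate(SELECT_RULES, "news.software.readers", "Subject: slrn logo.gif"))
```

Under these assumptions the first call ends in "kill" and the second in
"select", which is the difference between "action = 999" and
"action = select" mentioned above.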

The current parser still has some drawbacks, most notably that you
cannot use # anywhere (most problematic for the regexps, I guess),
because it is always interpreted as the start of a comment.
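
To make the # limitation concrete, here is a small Python sketch of a parser
that treats # as a comment start everywhere. The function `parse_line` is
hypothetical and only mimics the behaviour described above; it is not the
leafnode parser.

```python
def parse_line(line):
    """Parse one "key = value" line, treating '#' as a comment start.

    Everything from the first '#' onward is discarded, even inside a
    regexp. Returns (key, value), or None for blank or comment-only lines.
    """
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    key, _, value = line.partition("=")
    return key.strip(), value.strip()


# The regexp is silently cut off at the '#'.
print(parse_line("pattern = ^Subject:.*C# compiler"))
# -> ('pattern', '^Subject:.*C')
```

One possible way around this, not mentioned in the original discussion, is to
write "." where the # would go. The pattern then survives parsing, at the
cost of also matching any other character in that position.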

> I'd like to fervently request that *some* sort of enhanced filtering
> be put into leafnode-2.0, so that it can handle as a minimum the types
> of cases that I outlined above in my examples.  Keep in mind that I
> believe that this should be *optional*, and that the current filtering
> capabilities should still be available ... this way no one has to
> change anything unless they actually want to use the enhanced
> filtering.

No, I think the original filter file format should be abolished entirely.
Otherwise, the parser gets much too complicated. The old format was only
thrown together because I needed something quickly to filter the ARSBOMB
spam. A change of version number is a good time to change formats as well,
and we can include a conversion routine in "make update" if we want to.

(The new mechanism in the 2.0alpha versions will simply ignore the old
filter settings, so there is no harm done.)

> Also, over the next week or so I'm going to write and submit another
> proof-of-concept patch for a new filtering mechanism.

Before you write anything, I strongly recommend that you have a look at
the new filtering mechanism. It may save you some work :-) Anybody who
wants me to send a tarball of the latest alpha, just tell me. While it
builds on 1.9.4, I have included all patches that were made in the
1.9.5-1.9.10 versions. As a result, it features a rather impressive
changelog :-)

--Cornelius.

-- 
/* Cornelius Krasel, U Wuerzburg, Dept. of Pharmacology, Versbacher Str. 9 */
/* D-97078 Wuerzburg, Germany   email: phak004@xxxxxxxxxxxxxxxxxxxxxx  SP4 */
/* "Science is the game we play with God to find out what His rules are."  */
