
Re: [leafnode-list] Fetchnews: Falling back and regrouping.



On Fri, 29 Mar 2002, Matthias Andree wrote:

> Michael O'Quinn wrote on Thursday, 28 March 2002:
> 
> > > It's omitted by the current implementation, because the state is
> > > generated on the fly as interesting.groups is iterated over. When should
> > > leafnode expire information for unwanted groups from SERVERNAME? We
> > > don't want it to fill up with information for groups which someone hit
> > > accidentally.
> > 
> > Why not?  It's not THAT much of a performance hit to maintain them
> > forever.  You could add a command line switch to fetchnews to clean out
> > the cruft, that way the user could have it both ways.
> 
> I still fail to see why the current scheme is a problem ("bug").

Because people harvest binaries, but many don't have a large enough spool
to hold them very long.  Until I got this new hard drive I would select a
group or two, run fetchnews until either the spool filled up or I'd
fetched everything, harvest whatever I wanted, then immediately expire the
entire set of groups I had just been working with.

Then I would do it again, until I had run out of groups.  Sometimes I had 
to go through this cycle several times on just one group.
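
For concreteness, the cycle looks roughly like this (the group name is just
an example, and the exact invocations depend on the local setup):

    fetchnews        # pull alt.binaries.example until the spool fills up
    # ... pull the binaries I want out of the spool ...
    texpire          # with groupexpire for that group set very low, this
                     # clears out the whole set I just harvested
    # repeat until the group is exhausted, then move on to the next one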

I also have some groups I read and expire normally.

With the state information constantly being deleted, I found that 
fetchnews was repeatedly downloading articles I had seen before.

THAT is the bug.

I may be using the program in a way that was not anticipated, but without
an obscenely large news spool, that was the only way to get the job done.

> 
> > "Implicitly assume" is a euphemism for "Undocumented Behavior."
> 
> Documenting that is not an issue for me.

OK, that's true.

> 
> > If the user wants maxage, let them set maxage.  If people are requesting 
> > the ability to set it on a per-group basis, then do that.  But please 
> > don't set one parameter based upon another just because you think it makes 
> > sense to solve one problem.  It will almost always generate 10 more.
> 
> Can you already see any? What's wrong with not fetching articles we'd
> expire right away?

See my answer above.

Also, just because I don't want to KEEP articles for (say) three months
doesn't mean I don't want to SEE articles that are three months old.  If
I've decided to expire after 14 days, then during those 14 days I still
want to be able to go look at those older articles.
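
In config terms this is all I am asking for (just a sketch, with a made-up
group name):

    groupexpire alt.binaries.example = 14
    # maxage left alone (or set large), so a three-month-old article can
    # still be fetched and read during those 14 days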

Also, this COMPLETELY breaks the ability to harvest binaries as I
described above.  The only way to make it work would be to go and
fiddle-diddle the numbers in /etc/leafnode/config EACH TIME before
running fetchnews and then again EACH TIME before running texpire.
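
To spell out what that fiddling would mean under the proposed scheme (group
name again made up): before fetchnews I'd have to pretend I keep articles
for a long time, and before texpire I'd have to shrink it back down so the
spool actually gets cleared:

    # before fetchnews, so old articles still get fetched:
    groupexpire alt.binaries.example = 90
    # ... run fetchnews, harvest ...
    # before texpire, so the spool actually gets emptied:
    groupexpire alt.binaries.example = 1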

Again, this is much less of an issue with a larger spool, but I will STILL 
complain bitterly about not being able to see articles older than 2 weeks 
if I decide to expire after 14 days.

Groupexpire and maxage are simply NOT the same thing.  Please don't try to
force them together like this.

> 
> > Will SERVERNAME include the group currently being fetched at the time of 
> > the crash?
> 
> If we crash while we are fetching from that group, probably not.
> However, depending on the actual implementation I'll choose,
> checkpointing and "pick the last" may come for free, but I'm not
> convinced we will run into this often and checking all articles for a
> particular group will not hurt too much.

It will when harvesting binaries as I described above, when the total
number of articles is greater than will fit in the local spool.  This has
contributed to many duplicate downloads, until I figured out what was
happening and started to update SERVERNAME by hand.

Perhaps something should be put in the docs about this, if it can't be
fixed completely.

> 
> > Well, the whole point here is that I was having to expire articles before
> > they expired from the upstream server.  Since the message-id's are no
> > longer available on the leafnode end of things, what you just said breaks.
> 
> That's why I want to derive a per-group maxage from the group expire.

I understand that, but artificially restricting what the user can read is
not the solution.

> Newsreaders that keep a Message-ID history will not fetch articles that
> are older than the oldest entry of their history file either. Say, if
> they keep the Message-IDs for 21 days, they'll reject all articles older
> than that, "maxage = 21" in leafnode speak.

Actually, most newsreaders I've used keep track by the server's article
number, not Message-ID.  And most of them remember forever.  On *nix many
newsreaders simply use ~/.newsrc, which just keeps growing and growing and
growing...

This has worked for many years.  Leafnode's SERVERNAME files record a
subset of this information, and it would be enough IF IT WORKED!!!!
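
Just to illustrate the difference (formats from memory, so treat them as
approximate): a reader's ~/.newsrc keeps a range of article numbers per
group, something like

    alt.binaries.example: 1-123456

while the leaf.node/SERVERNAME file keeps only the last article number
fetched from each group on that server, something like

    alt.binaries.example 123456

That smaller record is the subset I mean, and it would do the job if it
were kept around.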

> 
> > Well, I still think it's a VERY bad idea to assume that groupexpire = 
> > WhatEver should imply maxage = WhatEver.  Just think of the bug reports 
> > this is likely to generate...
> 
> People will hardly notice, unless they read the docs. =:-> Seriously,
> though, this behaviour will be documented and at the moment, I start to
> think that I should merge maxage and expire in leafnode, they describe
> the same thing actually "I don't want articles older than N days."

They simply are NOT the same thing.

I don't want to see articles older than x days.

I only want to keep those articles for y days.

In other words, I want to be able to see articles that are up to x days
old, but I may only keep them around for y days.

As in "I want to check out Charles Dickens' 'A Christmas Carol' (VERY 
large x) from the library, but I'll only keep it for two weeks (relatively 
small y)."

Please eliminate this monstrous bogosity.

Michael O'Quinn





-- 
leafnode-list@xxxxxxxxxxxxxxxxxxxxxxxxxxxx -- mailing list for leafnode
To unsubscribe, send mail with "unsubscribe" in the subject to the list