
Re: [leafnode-list] Fetchnews: Falling back and regrouping.



On Thu, 28 Mar 2002, Matthias Andree wrote:

> Michael O'Quinn schrieb am Mittwoch, den 27. März 2002:
> 
> > I just added an 80 gig drive for my news spool.  Unfortunately, this box's
> 
> Hopefully that's not an IBM 120GXP series drive (IC35LxxxAVVAxxx) --
> they are rated for only 333 hours of operation per month (a 30-day
> month has 720 hours).

No, it's a Seagate Barracuda ST380021A.  I've heard of tons of problems
with IBM drives, but I have several and I've personally never had any
trouble.  Just lucky, I suppose.

> 
> I like GRUB because I can recover from configuration problems much more
> easily than with LILO, because I have a shell.

Well yeah, I like it too.  It's a much cleaner architecture than LILO, and
it's really a lot easier to use.  I just didn't happen to want to learn
that much about it at that point in time.  Particularly not with my
incoming mail server down...
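
For anyone who hasn't played with it yet: even with a hosed config file
you can boot by hand from the GRUB shell.  Something like this (device
and kernel path are just examples -- adjust for your own box):

    grub> root (hd0,0)
    grub> kernel /boot/vmlinuz-2.4.18 root=/dev/hda1 ro
    grub> boot

Try doing THAT with LILO.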

> 
> > Also, my incoming mail server was periodically down because of this, so I
> > may have missed something in this thread.  (Yes, I am currently running
> > mail and news on the same server.  Foolish, I know...)
> 
> Not foolish at all.

IMHO it is, from both a security and a reliability standpoint.  What just
happened to me is a case in point.  The mail server needs to be 100%
available or mail will bounce.  (Unless a backup mail forwarder is in
place, which I don't have.)
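
(For the record, a backup forwarder is nothing more than a second,
higher-numbered MX record pointing at a host that will queue mail for
you.  A made-up zone fragment:

    example.com.    IN MX 10 mail.example.com.
    example.com.    IN MX 20 backup.example.net.

One of these days I'll set one up...)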

When things get busy, the last thing you want is Usenet news affecting
your mail system's performance.

Also, both mail (sendmail in particular) and news (INN) have historically
had many security issues, some big enough to drive a truck through.

This has more to do with a philosophy of robustness and reliability than
with some strict and absolute rule, and of course YMMV.

Now back to our regularly scheduled programming...

> 
> > > > (2) When fetchnews manages to complete its run, but for some
> > > > reason skips 10,000 or so articles in one group (that I've
> > > > confirmed ARE there), the SERVERNAME file is not updated.
> > > 
> > > I have yet to see why, because on my machine, this does not show.
> > 
> > NewsPlex may be suspect here -- the logs seem to be inconclusive, at
> > least to me.  I've recently posted the output from a recent run.
> > Perhaps that will shed some light.
> 
> It did not yet help me; maybe a log from a recent version (the 1.9.20
> release) will help us further.

When I get time...

Real Soon Now.

Honest!

(Sort of...)

> 
> > Why?
> > 
> > I'd think you could just trap for the SIG, and get on with saving the
> > current state before exiting.
> 
> The state info is generated on-the-fly, and the current architecture
> makes it very hard to trap exceptions like this one.

I haven't dug into the code enough to comment on this, nor are my skills
current enough to do so intelligently.  Anyone else have any suggestions
here?

> 
> > > In this case, however, the SERVERNAME~ file should be there and
> > > contain some useful information, can you verify that on your
> > > machine?
> > 
> > When I press <CTRL-C> the old one is not deleted, updated, nor touched
> > in any way.
> 
> Then that's ok, that's the first half. The other half is to come in
> 1.9.21.

Yea!!

> 
> > Which is exactly the behavior I would expect and want.  If I don't
> > want all 28,000 articles I will set maxfetch to something lower.  Is
> > that not what it's for?  This is, as far as I can tell, undocumented
> > behavior.  If you REALLY think it's necessary to automagically delete
> > state info from the SERVERNAME file, make it configurable so such bad
> > behavior can be turned off.
> 
> It's omitted by the current implementation, because the state is
> generated on the fly as interesting.groups is iterated over. When should
> leafnode expire information for unwanted groups from SERVERNAME? We
> don't want it to fill up with information for groups which someone hit
> accidentally.

Why not?  It's not THAT much of a performance hit to maintain them
forever.  You could add a command line switch to fetchnews to clean out
the cruft; that way the user could have it both ways.

Or just let the user/administrator do it manually, and add that to the
docs.
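
To show what I mean, here is a rough sketch of the kind of one-shot
cleanup tool I'm picturing.  I'm ASSUMING each SERVERNAME line starts
with the group name followed by whitespace -- I haven't checked that
against the leafnode sources, so treat this as C-flavored pseudocode:

    /* sketch: print only the SERVERNAME lines whose group still
     * appears in interesting.groups; run while fetchnews is idle,
     * redirect to a temp file, and move it back over SERVERNAME */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define MAXGROUPS 20000

    static char *groups[MAXGROUPS];
    static int ngroups;

    static int interesting(const char *name)
    {
        int i;
        for (i = 0; i < ngroups; i++)
            if (strcmp(groups[i], name) == 0)
                return 1;
        return 0;
    }

    int main(int argc, char **argv)
    {
        char line[1024], name[1024];
        FILE *f;

        if (argc != 3) {
            fprintf(stderr, "usage: %s interesting.groups SERVERNAME\n",
                    argv[0]);
            exit(1);
        }

        /* slurp the list of groups we still want */
        if ((f = fopen(argv[1], "r")) == NULL) { perror(argv[1]); exit(1); }
        while (ngroups < MAXGROUPS && fgets(line, sizeof line, f))
            if (sscanf(line, "%1023s", name) == 1)
                groups[ngroups++] = strdup(name);
        fclose(f);

        /* pass through only the state lines for those groups */
        if ((f = fopen(argv[2], "r")) == NULL) { perror(argv[2]); exit(1); }
        while (fgets(line, sizeof line, f))
            if (sscanf(line, "%1023s", name) == 1 && interesting(name))
                fputs(line, stdout);
        fclose(f);
        return 0;
    }

Crude, but it makes the point: this doesn't need to live inside
fetchnews at all.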

> > > It'd be useful to set maxage from the expire time automatically, and
> > > I can envision that feature for 1.9.21.
> > 
> > Huh?
> 
> The plan is: "don't even fetch articles you must expire right away." So,
> if groupexpire for a particular group is 7 days, implicitly assume
> maxage = 7 days.

No, this is completely wrong.  

"Implicitly assume" is a euphemism for "Undocumented Behavior."

If the user wants maxage, let them set maxage.  If people are requesting 
the ability to set it on a per-group basis, then do that.  But please 
don't set one parameter based upon another just because you think it makes 
sense to solve one problem.  It will almost always generate 10 more.
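
To make the objection concrete: the proposal boils down to something
like this (my sketch of the idea, NOT actual leafnode code):

    /* sketch of the proposed coupling, not leafnode source:
     * if the user never set maxage, quietly inherit groupexpire */
    int effective_maxage(int maxage, int groupexpire)
    {
        if (maxage > 0)
            return maxage;      /* explicitly configured -- fine */
        return groupexpire;     /* the silent "implicit assume" */
    }

A user who bumps groupexpire to keep articles around longer has now
silently changed which articles get FETCHED, and good luck figuring
that out from the logs.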

> 
> > How about this:
> > 
> > Create the SERVERNAME~ file with a lot of extra zeros, like INN, and
> > then update it in place after each article, or each ten, or each
> > <Configurable Number>.  This allows you to have very frequent updates
> > without the major penalty of re-writing the entire file each time.
> 
> Too much new code for 1.9, and more than needed to fix this.

I agree.
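
(For the archives, the in-place scheme I was picturing is roughly the
following: fixed-width records overwritten with fseek(), so the file
never has to be rewritten whole.  The record layout here is invented,
and group names longer than the field would need real handling:

    /* sketch: overwrite record 'idx' in a preallocated state file
     * without touching the rest of it */
    #include <stdio.h>

    #define RECLEN 64               /* fixed width, newline included */

    int update_record(FILE *f, long idx, const char *group, long artno)
    {
        char rec[RECLEN + 1];

        snprintf(rec, sizeof rec, "%-47s %015ld\n", group, artno);
        if (fseek(f, idx * RECLEN, SEEK_SET) != 0)
            return -1;
        if (fwrite(rec, 1, RECLEN, f) != RECLEN)
            return -1;
        return fflush(f);
    }

Agreed that it's 2.0 material, not 1.9.)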

> 
> > Finally, if the SERVERNAME~ file exists at the beginning of a run,
> > assume we crashed and roll in the updates immediately.
> 
> That's sufficient, together with writing this file in line-buffered
> mode: if we crash, SERVERNAME~ is complete and can be rolled in.

This sounds like an excellent improvement!

Will SERVERNAME include the group currently being fetched at the time of 
the crash?
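
In other words, I'm picturing something like this -- my sketch of the
scheme as I understand it, not the actual patch:

    /* write state line-buffered, so SERVERNAME~ is always complete
     * up to the last full line even if we die mid-run */
    #include <stdio.h>

    FILE *open_state(const char *tmpname)
    {
        FILE *f = fopen(tmpname, "w");

        if (f != NULL)
            setvbuf(f, NULL, _IOLBF, 0); /* flush after every newline */
        return f;
    }

    /* at startup: a leftover SERVERNAME~ means the last run crashed,
     * so roll its state in before fetching anything */
    void recover_state(const char *tmpname, const char *statename)
    {
        FILE *f = fopen(tmpname, "r");

        if (f != NULL) {
            fclose(f);
            rename(tmpname, statename);
        }
    }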

> 
> > Having thought about this a few days, maybe the easiest way would be
> > to write a new line, every X articles (where X is configurable), to
> > SERVERNAME~ with the current state for the current group.  Then, when
> > rolling SERVERNAME~ into SERVERNAME, ignore every entry for each group
> > except for the last one.  And, of course, fetchnews should check when
> > it starts to see if there are any SERVERNAME~ files to roll over
> > (which would mean it crashed last time) and do so immediately, before
> > starting to fetch articles.
> 
> Not necessary, because XOVER/XHDR will tell us which articles we already
> have, without fetching them.

Well, the whole point here is that I was having to expire articles
locally before they expired from the upstream server.  Since the
message-id's are no longer available on the leafnode end of things, what
you just said breaks.

OTOH, the changes mentioned above will go a long way toward fixing the 
problem, so this is much less of an issue than it was.  
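
(For anyone following along: XOVER returns one overview line per
article, and that line includes the Message-ID, which is how fetchnews
can skip what it already has without downloading it.  An illustrative
session, article numbers made up:

    GROUP news.software.nntp
    211 1250 10000 11249 news.software.nntp
    XOVER 11248-11249
    224 Overview information follows
    11248 <tab-separated: subject, from, date, <id1@example.com>, ...>
    11249 <tab-separated: subject, from, date, <id2@example.com>, ...>
    .

The catch is that the comparison needs the Message-IDs of the articles
we "already have" -- and those are exactly what texpire just threw
away.)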

> 
> > > As written above, I can imagine deriving a per-group maxage setting
> > > from the (per-group) expire time.
> > 
> > I don't understand the last sentence?
> 
> As written above, we'll never fetch articles that would be expired
> right away, as if texpire were running in parallel with fetchnews.  (It
> cannot actually run in parallel, unless my new locking code is wrong ;)

Well, I still think it's a VERY bad idea to assume that groupexpire = 
WhatEver should imply maxage = WhatEver.  Just think of the bug reports 
this is likely to generate...

Michael O'Quinn


-- 
leafnode-list@xxxxxxxxxxxxxxxxxxxxxxxxxxxx -- mailing list for leafnode
To unsubscribe, send mail with "unsubscribe" in the subject to the list