
Re: [leafnode-list] Fetchnews: Falling back and regrouping.



Michael O'Quinn wrote on Wednesday, 27 March 2002:

> I just added an 80 gig drive for my news spool.  Unfortunately, this box's

Hopefully that's not an IBM 120GXP series drive (IC35LxxxAVVAxxx) --
they are rated for only 333 hours per month (a 30-day month has 720
hours).

> motherboard has a legacy BIOS, so I've been fighting lockups and other
> disk size issues.  I think I've finally got it whipped, although I now
> know _far_ more than I've ever wanted to know about GRUB, the GRand
> Unified Bootloader that RedHat is using instead of good old LILO.

I like GRUB because I can recover from configuration problems much more
easily than with LILO: I have a shell.

> Also, my incoming mail server was periodically down because of this, so I
> may have missed something in this thread.  (Yes, I am currently running
> mail and news on the same server.  Foolish, I know...)

Not foolish at all.

> > > (2) When fetchnews manages to complete its run, but for some
> > > reason skips 10,000 or so articles in one group (that I've
> > > confirmed ARE there), the SERVERNAME file is not updated.
> > 
> > I have yet to see why, because on my machine, this does not show.
> 
> NewsPlex may be suspect here -- the logs seem to be inconclusive, at
> least to me.  I've recently posted the output from a recent run.
> Perhaps that will shed some light.

It hasn't helped me yet; maybe a log from a recent version (the 1.9.20
release) will help us further.

> Why?
> 
> I'd think you could just trap for the SIG, and get on with saving the
> current state before exiting.

The state info is generated on-the-fly, and the current architecture
makes it very hard to trap exceptions like this one.
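
Just to illustrate what "trap for the SIG" usually means -- a handler
that only sets a flag which the fetch loop checks -- here is a minimal
sketch; this is not leafnode code, and retrofitting it onto the current
fetchnews structure is exactly the hard part:

    #include <signal.h>

    /* sketch only, not leafnode code */
    static volatile sig_atomic_t stop_requested;

    static void catch_sigint(int sig)
    {
        (void)sig;
        stop_requested = 1;    /* no I/O in the handler itself */
    }

    /* in main():           signal(SIGINT, catch_sigint);
     * in the fetch loop:   if (stop_requested) break;
     *                      then save the state and exit cleanly
     */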

> > In this case, however, the SERVERNAME~ file should be there and
> > contain some useful information, can you verify that on your
> > machine?
> 
> When I press <CTRL-C> the old one is not deleted, updated, nor touched
> in any way.

Then that's OK; that's the first half. The other half will come in
1.9.21.

> Which is exactly the behavior I would expect and want.  If I don't
> want all 28,000 articles I will set maxfetch to something lower.  Is
> that not what it's for?  This is, as far as I can tell, undocumented
> behavior.  If you REALLY think it's necessary to automagically delete
> state info from the SERVERNAME file, make it configurable so such bad
> behavior can be turned off.

The current implementation simply omits that information, because the
state is regenerated on the fly as interesting.groups is iterated over.
When should leafnode expire information for unwanted groups from
SERVERNAME? We don't want the file to fill up with entries for groups
that someone hit accidentally.

> BTW, by "artlimit" do you mean "maxfetch"?

Euhm, yes. The config parameter name and the name of the variable in the
C code do not match.
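
For the curious, the mismatch is of this kind (an illustrative sketch,
not the real leafnode config parser):

    #include <stdlib.h>
    #include <string.h>

    /* sketch: the config file keyword is "maxfetch", but the value
     * ends up in a variable called artlimit */
    static long artlimit;

    static void parse_config_line(const char *key, const char *value)
    {
        if (strcmp(key, "maxfetch") == 0)
            artlimit = strtol(value, NULL, 10);
    }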

> > It'd be useful to set maxage from the expire time automatically, and
> > I can envision that feature for 1.9.21.
> 
> Huh?

The plan is: "don't even fetch articles you must expire right away." So,
if groupexpire for a particular group is 7 days, implicitly assume
maxage = 7 days.
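
In code the idea boils down to something like this sketch (made-up
struct and field names, not the real leafnode ones):

    /* sketch: if no maxage is configured, inherit it from groupexpire
     * so we never fetch what texpire would delete right away */
    struct groupconf {
        long maxage;        /* days; 0 = not set */
        long groupexpire;   /* days; 0 = not set */
    };

    static void derive_maxage(struct groupconf *g)
    {
        if (g->maxage == 0 && g->groupexpire > 0)
            g->maxage = g->groupexpire;
    }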

> How about this:
> 
> Create the SERVERNAME~ file with a lot of extra zeros, like INN, and
> then update it in place after each article, or each ten, or each
> <Configurable Number>.  This allows you to have very frequent updates
> without the major penalty of re-writing the entire file each time.

Too much new code for 1.9, and more than needed to fix this.

> Finally, if the SERVERNAME~ file exists at the beginning of a run,
> assume we crashed and roll in the update immediately.

That's sufficient, together with writing this file in line-buffered
mode: if we crash, SERVERNAME~ is complete and can be rolled in.
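
Roughly like this sketch (not the 1.9.21 code; the filenames are
abbreviated and the "group lastarticle" line format is only assumed
here). setvbuf() makes stdio flush after every newline, and rename()
rolls the complete file in atomically -- the same call the
crash-recovery check at startup can use:

    #include <stdio.h>

    /* sketch: checkpoint file written line-buffered, then renamed over
     * the real state file; a crash loses at most a partial line */
    static int write_state_example(void)
    {
        FILE *f = fopen("SERVERNAME~", "w");
        if (f == NULL)
            return -1;
        setvbuf(f, NULL, _IOLBF, BUFSIZ);   /* flush after each line */
        /* one line per finished group; format assumed for the example */
        fprintf(f, "%s %lu\n", "comp.os.linux.misc", 123456UL);
        if (fclose(f) != 0)
            return -1;
        return rename("SERVERNAME~", "SERVERNAME");  /* roll it in */
    }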

> Optionally, to avoid the complexity of updating the numbers in place,
> just rewrite the entire SERVERNAME~ file, but also roll it into the
> SERVERNAME file after each group.  That way it never grows very big.

The size doesn't matter; the file is only read once.

> Having thought about this a few days, maybe the easiest way would be
> to write a new line to SERVERNAME~ with the current state for the
> current group every X articles, where X is configurable.  Then, when
> rolling SERVERNAME~ into SERVERNAME, ignore every entry for each group
> except for the last one.  And, of course, fetchnews should check when
> it starts to see if there are any SERVERNAME~ files to roll over
> (which would mean it crashed last time) and do so immediately, before
> starting to fetch articles.

Not necessary, because XOVER/XHDR will tell us which articles we already
have, without fetching them.
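
For the record, the overview lines XOVER returns are tab-separated
(number, subject, from, date, message-id, ...), so the message-id
needed for that check can be cut out of each line with a few lines of
C -- a sketch, not a quote from the fetchnews source:

    #include <string.h>

    /* sketch: return the message-id (5th tab-separated field) of one
     * XOVER line, or NULL if the line is malformed; fetchnews can then
     * skip article numbers whose message-id is already in the spool */
    static char *xover_msgid(char *line)
    {
        int i;
        char *p = line;

        for (i = 0; i < 4; i++) {   /* skip number, subject, from, date */
            p = strchr(p, '\t');
            if (p == NULL)
                return NULL;
            p++;
        }
        p[strcspn(p, "\t")] = '\0'; /* cut at the next tab, if any */
        return p;
    }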

> > Certainly, checkpointing the state would be helpful, but then again,
> > leafnode assumes that you do NOT expire faster than you download. 
> 
> Well, assume is a dirty word in some lexicons.  Seriously, all
> software is used in ways the original creators never imagined.  This
> is actually a reasonable usage pattern for a small personal leaf node
> news server.  Not for a large site with many users, of course.

leafnode neither feeds nor expires by size, so that usage pattern is
reasonable regardless of the size of the site.

> > As written above, I can imagine deriving a per-group maxage setting
> > from the (per-group) expire time.
> 
> I don't understand the last sentence?

As written above, we'll never fetch articles that would be expired
right away if texpire were running in parallel with fetchnews. (It
cannot run in parallel, unless my new locking code is wrong ;)
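
The locking is conceptually just an exclusive-create lockfile, along
these lines (a sketch; the path is assumed, not taken from the actual
code):

    #include <fcntl.h>
    #include <unistd.h>

    /* sketch: whoever manages to create the lockfile exclusively may
     * run; everybody else backs off */
    static int try_lock(const char *path)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (fd < 0)
            return -1;  /* another fetchnews/texpire holds the lock */
        close(fd);
        return 0;       /* caller unlink()s the file when done */
    }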

-- 
Matthias Andree

-- 
leafnode-list@xxxxxxxxxxxxxxxxxxxxxxxxxxxx -- mailing list for leafnode
To unsubscribe, send mail with "unsubscribe" in the subject to the list