
[leafnode-list] Re: Fetchnews messages

On 19.05.2009 at 15:31, Enrico Wiki wrote:

> On Sat, May 16, 2009 at 6:41 PM, Matthias Andree wrote:
> On 16.05.2009 at 12:28, Enrico Wiki wrote:
>>  On Wed, May 13, 2009 at 10:45 PM, Matthias Andree wrote:
>>>  Hi Enrico,
>>>> leafnode isn't made to find magic ways when the underlying transport
>>>> protocols or physical links fail. Leafnode relies on the operating
>>>> system to handle TCP properly (most do this today for the common
>>>> subset that leafnode uses); and when connections break, well, that's it.
>>> Of course so. But there are times connections are unstable, come and
>>> go, and perhaps some retrievers try harder than others? I don't know,
>>> it's just a wild guess of the reason I could not complete a leafnode
>>> job, when the connection was rough, but could complete retrieval with
>>> some readers.
>> Possibly. But hey, you are using Unix or a Unix-like operating system,
>> so unlike with the typical graphical tool, which has to integrate all
>> the features including retries, you can add features yourself, see below.
>>  Having said that, I know the real culprit was my connection, like I
>> said.
>>> Now the connection is fine and leafnode is doing very well.
>> The obvious workaround will be to tell fetchnews to retry, for
>> instance along these lines as a cron entry:
> OK, thanks for the hints.  Actually, more than looking forward to a
> practical result (for the moment being), I am testing leafnode in order  
> to understand what it does and how it does it,  including under stress
> situations, so to speak.  :-)

Fair enough. The interesting bug reports and questions that lead to
clarifying the manuals originate in such experiments :-)

>> 17 * * * * while ! fetchnews [--options] ; do sleep 300 ; done
>> This will poll hourly at X:17 o'clock, and after failure retry after
>> sleeping for 300 seconds. Watch your logs though...
> That will retry after failure but not after success, right?

Yes, but if it cannot succeed for whatever reason, it will hammer away on
the server every 5 minutes, and more and more of these shell processes will
accumulate, possibly also delaying other cron jobs.
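
One way to limit that risk, as a rough sketch rather than anything leafnode ships: cap the number of retries and take a lock so that overlapping cron runs do not pile up. The function names retry_bounded and with_lock, the lock path and the limits are all made up for illustration; "fetchnews [--options]" stands for whatever command line you normally use.

```shell
#!/bin/sh
# Sketch only: bounded retries plus a simple lock.
# retry_bounded, with_lock and all paths/limits are illustrative assumptions.

# usage: retry_bounded MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
retry_bounded() {
    max_tries=$1
    delay=$2
    shift 2
    try=1
    while [ "$try" -le "$max_tries" ]; do
        if "$@"; then
            return 0            # command succeeded, stop retrying
        fi
        if [ "$try" -lt "$max_tries" ]; then
            sleep "$delay"      # wait before the next attempt
        fi
        try=$((try + 1))
    done
    return 1                    # gave up after max_tries failures
}

# usage: with_lock LOCKDIR COMMAND [ARGS...]
# mkdir either creates the directory or fails, so only one run holds the lock.
with_lock() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        return 75               # another run is still active, skip this one
    fi
    rc=0
    "$@" || rc=$?
    rmdir "$lockdir"
    return "$rc"
}
```

Saved as a script that ends with a line such as "with_lock /tmp/fetchnews.lock retry_bounded 5 300 fetchnews", and called from the cron entry instead of the bare loop, this gives up after five attempts and skips a run while an earlier one is still retrying. If the script is killed, the lock directory stays behind and has to be removed by hand.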

>>  Perhaps - but usually a sign that the upstream server's database is
>>>> corrupt/inconsistent, particularly the overview data doesn't match the
>>>> available articles: the XOVER command offers articles that aren't
>>>> available any more.
>>> Ok. I tried both xover and xhdr, same result.
>> They would be unlikely to differ from each other, as both access the
>> overview database, which is separate from the article database in many
>> implementations, including leafnode (although leafnode can afford to
>> fix inconsistencies on the assumption that the overall load is lower,
>> so you'll usually see consistent data): there are the message.id and
>> group directories, and there are the .overview files in the group
>> directories...
> BTW: if the upstream server supports xhdr, would you recommend using it  
> for fetchnews, in terms of speed?

AFAIR (without looking at actual code), we'll have to issue several XHDR  
commands to obtain all necessary information, so XOVER has the advantage  
of using fewer round trips, and for that reason, it's the default.

>>  Well, no, it wasn't aborted.
>>> I tried again (starting from scratch) and had the same results.
>> Have some of the articles been crossposted to several groups? Then  
>> leafnode will have fetched it for one of the groups, and when it's  
>> listed in another, it will not download another copy, because it  
>> already has it.
> Nope, no crossposted articles, in that case. Just
> *"store: duplicate article "
> and in the same number as the "killed" articles.

Euh. I think - again, without looking at code - that something with  
fetchnews OR with your spool is wrong.

For some reason, fetchnews decides inconsistently if it already has  
certain articles in the spool. Does this happen with XHDR or with XOVER or  
with both?
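
To compare the two cases, it may help to count the messages per run. A small sketch, assuming you have captured the output or log of each run in a file of its own; the function name count_duplicates and the file names are hypothetical, and where leafnode's messages end up depends on your syslog setup.

```shell
# usage: count_duplicates LOGFILE
# Prints the number of lines in LOGFILE containing "store: duplicate article".
count_duplicates() {
    # grep -c prints the count; it exits 1 when the count is 0,
    # which is not an error for our purposes.
    grep -c 'store: duplicate article' "$1" || [ $? -eq 1 ]
}
```

Running "count_duplicates xover-run.log" and "count_duplicates xhdr-run.log" (both file names are placeholders) then shows whether the numbers differ between the two modes.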

At any rate, PLEASE DO SAVE your current configuration and command lines
and take notes of what might trigger these "store: duplicate article"
messages. I think there's a bug in fetchnews/store/... somewhere. In the simplest
case, it's just XHDR responses not being used properly, but that doesn't  
go without closer inspection (which I don't have time for in the next few  

Could you do me a favour and save your current configuration, and if you  
can afford the space, also a tarball/pax archive, or rsync copy of the  
spool with hard links intact, and also command lines that you used (try  
"history" if you've been hacking away in your shell). I may need some of  
this information later for debugging.
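
For the archive, something along these lines should do; it is only a sketch. The function name snapshot_dir is made up, and the directories to pass in (on many installations /etc/leafnode and /var/spool/news, but check yours) are assumptions. The archive file must live outside the directory being saved.

```shell
# usage: snapshot_dir SOURCE_DIR ARCHIVE.tar
# Within one archive, tar records files that share an inode as hard links,
# so the links between files under SOURCE_DIR survive extraction.
snapshot_dir() {
    src=$1
    archive=$2
    tar -cf "$archive" -C "$src" .
}
```

If you prefer a plain copy over an archive, rsync with the -a and -H options keeps hard links as well. Redirecting the output of "history" into a file next to the archive covers the command lines.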

Thanks in advance.

Best regards

Matthias Andree
leafnode-list mailing list