[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [leafnode-list] Differences between Leafnode and Leafnode+

David Given wrote:

> trn/strn won't post to leafnode. I've used trn v3.6 (20 Nov 1994), configured 
> for NNTP. I forget the exact problem, but I think I've posted here about it 
> before.

Hmmmm... I'll look it up. Wasn't it something that trn doesn't use POST
itself but instead tries to call some kind of "inews" program?

> Cornelius Krasel wrote:
> >I also
> >received a patch by Remi Guyomarch which should increase the speed of
> >downloading the active file dramatically (from almost an hour down to
> >about five minutes), although only for the first run.
> Impressive. How does it manage that?

The list of groups is usually stored as a tree which makes searching for
groups quite fast. However, inserting new groups in that tree is very
inefficient if the groups come already sorted (as they do with most
news servers), as the tree then essentially becomes a linear list. The
solution seems to be to grab all the newsgroups, store them in a list
and sort the list locally.

This is what Remi wrote about his patch:

Since most of the time, the NNTP server send us a more or less sorted
active file, repeated calls to insertgroup() from nntpactive() will
create a linked list instead of the intended binary tree. This makes
the process incredibly slow on old hardware. For example, my 486
DX4/100 can't cope with a 28000 lines active file coming at ~7 KB/s
through a 33.6 modem. After just a few seconds, the CPU goes to 100%
and after a minute or so, 'fetch' reads this active file at a rate of
~900 bytes per second.
At such a rate, I think reading the full active file and all
descriptions will probably take 1 hour and perhaps more.

Doing a gprof session on 'fetch' showed that insertgroup() is
responsible for 99.9% of this slowdown.

With my patch :
- nntpactive() reads the active file and store each line in a
  linked-list buffer.
- It sorts this active file with a qsort()
- It insert those lines using insertgroup() with a recursive function
  which act like this :
        * it inserts the group in the middle of the current block
        * it calls itself with the block before this middle line
        * it calls itself with the block after this middle line
  I think this function makes the binary tree perfectly
  balanced. Inserting 28000 lines with this function require ~1 second
  on my (old) hardware.
- This recursive function also free() each linked-list element after
  the call to insertgroup() [the funny lines "free( array[middle] -
  sizeof(char*) )"]

With this (hacky) patch I need only ~6 minutes to retrieve the
full active file and descriptions from my newserver, instead of the
expected 1 hour with the un-pached 'fetch'.

I did some benchmarking myself, and it appears that, on my machine
(a lousy 486 DX 2/66) the current algorithm needs 1 min 10 sec for
inserting 5000 lines from the active file in an empty tree while
the new algorithm needs 3 sec. The more groups you have, the greater
will be the difference.


/* Cornelius Krasel, U Wuerzburg, Dept. of Pharmacology, Versbacher Str. 9 */
/* D-97078 Wuerzburg, Germany   email: phak004@xxxxxxxxxxxxxxxxxxxxxx  SP4 */
/* "Science is the game we play with God to find out what His rules are."  */

leafnode-list@xxxxxxxxxxxxxxxxxxxxxxxxxxxx -- mailing list for leafnode
To unsubscribe, send mail with "unsubscribe" in the subject to the list