[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [leafnode-list] 2.0b8_ma3 leafnode very slow
Stefan Wiens <s.wi@xxxxxxx> writes:
> Libc asks isatty() to find the appropriate buffering mode.
Might be, GNU libc 2.1.3 and FreeBSD 4.3 libc (I did not check others)
are documented like this, however, setting a 16 kB buffer helped speed
things up a little here (GNU libc), for whatever reason, I don't know.
> When connected via inetd(8), it chooses full buffering (4096 chars).
> Maybe some inetd replacement makes stdout look like a tty?
My xinetd doesn't, isatty(0) -> 0.
It seems relying on libc to set fully-buffered mode is not a good idea.
> But in most cases it's completely useless. When parsing .overview, one
> could at no cost check the lines for ascending order and skip
> sorting. Besides that, when adding new lines, some kind of insertion
> sort is more efficient than qsort(). I am currently testing a separate
> program for updating .overview files (after fetchnews). It takes the
> same time as fixxover(), but with only 5% of its memory usage.
What is the highest memory usage you see in fixxover that makes you
optimize this function which is sort of a one-shot use, runs for a
couple of seconds?
> The same considerations apply for active information. Because
> groupinfo is already sorted (normally), qsort() is overkill.
> With 20,000 groups, qsort() needs more than 200,000 strcmp() calls
> where 20,000 would be sufficient.
My current version calls mergegroups less often than _ma4, which helps
already. I believe the readlocalgroups() in rereadactive() is the
culprit here. I'll see what I can do to cut the qsort invocations down.
I'd rather not see what happens when a newly-fetched active is sorted
with a "weaker" sort. You cannot rely on your servers sending sorted
lists. (t-online, for instance, doesn't).
It might be worthwhile looking at a natural mergesort to cut the best
(and common) case to O(n), taking advantage of "almost sorted"
condition, and in the same run restrict the worst case to O(n log n) --
quick sort has O(n^2) worst-case. Not all implementations use a median
selection to avoid this case.
Mergesort OTOH requires an additional pointer array.
That would certainly upset you :-) (although my newsgroup names usually
take much more space than the pointers to the names).
A heap sort variant (bottom-up, weak-heap sort) might be another
candidate, taking less additional memory than mergesort but is less
simple than mergesort is. If you find a BSD or MIT-licensed variant of
weak-heapsort, best used with qsort-style arguments, give it a shot and
report back.
Finally, there's radixsort, but that only suits for replacing qsort
where it calls upon strcmp or strcasecmp.
--
Matthias Andree
--
leafnode-list@xxxxxxxxxxxxxxxxxxxxxxxxxxxx -- mailing list for leafnode
To unsubscribe, send mail with "unsubscribe" in the subject to the list