[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[leafnode-list] Re: better hash for message.id/ ?

On Wed, 5 Nov 2008 01:39:03 +0100 Matthias Andree wrote:

> clemens fischer schrieb am 2008-11-04:
> This is known to be suboptimal, but feel free to change for
> leafnode-2. If you want to give it a show while waiting for the next
> precinct or state results, change the code, install and run texpire -r
> (-r for repair), it is supposed to fix wrong hashes (but best to keep
> a backup of /var/spool/news anyways).
> If it results in a better distribution (and perhaps if it maintains
> that property with r = r % 1000 rather than r = r % 999 - or would a
> prime be better here?), send me a patch, I'm open to change this.

I'm not sure about a prime here, unless it is built into the hash
function as the number of buckets.  It should be prime, but it doesn't
have to be.

>> There's a simple hash function by Weinberger, see Aho, Sethi, Ullman
>> book on compilers from 1986 or search for hashpjw.  It basically goes
> What's the license on that, or did you derive the code yourself?

I googled for the word "hashpjw".  There are all sorts of pages
mentioning it.  All the projects cite the source, but modify and use the
code as needed.  To me, this is either public domain or BSD.

> On the other hand, there is already an RC4-based PRNG, I wonder if
> that can be recycled for hashing and might even yield better results.

What do you mean by "recycled"?  AFAICS arc4 is in the tree, but used
nowhere!  If it were, nothing could be easier: insert initialization
and "stirring" (ie. reseeding) at the proper locations, then use
arc4_getword() to get a 32-bit quantity and scale it down to 999.

As it stands it might be better to replace the module with a simpler one
not needing that much care in terms if init and reseed.


leafnode-list mailing list