I wrote:
>
> * v8 with chunked interface:
> latency average = 555.688 ms
>
> This starts to improve things for me.
>
> * v8 with chunked, and return lower 32 bits of full 64-bit hash:
> latency average = 556.324 ms
>
> This is within the noise level. There doesn't seem to be much downside
> of using a couple cycles for fasthash's 32-bit reduction.
>
> * revert back to master from Dec 4 and then cherry pick a86c61c9ee
> (save last entry of SearchPathCache)
> latency average = 545.747 ms
>
> So chunked incremental hashing gets within ~2% of that, which is nice.
> It seems we should use that when removing strlen, when convenient.
>
> Updated next steps:
> * Investigate whether/how to incorporate final length into the
> calculation when we don't have the length up front.
> * Add some desperately needed explanatory comments.
> * Use this in some existing cases where it makes sense.
> * Get back to GUC hash and dynahash.
For #1 here, I cloned SMHasher and was dismayed at the complete lack
of documentation, but after some poking around, found how to run the
tests, using the 32-bit hash to save time. It turns out that the input
length is important. I've attached two files of results. "nolen"
means I stopped using the initial length to tweak the internal seed;
as you can see, that gives 8 failures. "pluslen" means I then
incorporated the length within the finalizer. This *does* pass
SMHasher, so that's good. (Of course, this way can't produce the same
hash as when we know the length up front, but that's not important.)
The attached shows how that would work, after further whacking around
and testing with Jeff's prototype for the search path cache hash
table. I'll work on code comments and get it polished.
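
To illustrate why the length matters, here is a minimal standalone
sketch. It is not the attached patch. All the sketch_* names are
hypothetical, the constants and mixing steps are modeled on the public
fasthash as best I recall them, and XOR-ing the length in just before
the final mix is only one assumed way of folding it into the
finalizer. Input is consumed in zero-padded 8-byte chunks and the
length is never used for the seed, so without the length two keys that
differ only in trailing zero bytes collide.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Multiplier modeled on fasthash (illustrative, not authoritative). */
#define SKETCH_M UINT64_C(0x880355f21e6d1965)

typedef struct
{
	uint64_t	h;		/* running hash */
	size_t		len;	/* bytes consumed so far */
} sketch_hash_state;

/* Invertible mixing step: xorshift, odd multiply, xorshift. */
static inline uint64_t
sketch_mix(uint64_t h)
{
	h ^= h >> 23;
	h *= UINT64_C(0x2127599bf4325c37);
	h ^= h >> 47;
	return h;
}

/* Note that the seed is NOT tweaked by the input length here. */
static inline void
sketch_init(sketch_hash_state *s, uint64_t seed)
{
	s->h = seed;
	s->len = 0;
}

/* Accumulate one chunk of at most 8 bytes, zero-padded. */
static inline void
sketch_accum(sketch_hash_state *s, const unsigned char *p, size_t n)
{
	uint64_t	chunk = 0;

	memcpy(&chunk, p, n);
	s->h ^= sketch_mix(chunk);
	s->h *= SKETCH_M;
	s->len += n;
}

/* "nolen": the finalizer ignores the length entirely. */
static inline uint64_t
sketch_final_nolen(const sketch_hash_state *s)
{
	return sketch_mix(s->h);
}

/* "pluslen": fold the accumulated length in before the final mix. */
static inline uint64_t
sketch_final_pluslen(const sketch_hash_state *s)
{
	return sketch_mix(s->h ^ ((uint64_t) s->len * SKETCH_M));
}

/* Reduce to 32 bits in the style of fasthash32. */
static inline uint32_t
sketch_reduce32(uint64_t h)
{
	return (uint32_t) (h - (h >> 32));
}

/* Hash a buffer chunk by chunk; use_len selects the finalizer. */
static uint64_t
sketch_hash_bytes(const unsigned char *buf, size_t len,
				  uint64_t seed, int use_len)
{
	sketch_hash_state s;

	sketch_init(&s, seed);
	while (len > 0)
	{
		size_t		n = (len < 8) ? len : 8;

		sketch_accum(&s, buf, n);
		buf += n;
		len -= n;
	}
	return use_len ? sketch_final_pluslen(&s) : sketch_final_nolen(&s);
}
```

With this sketch, the one-byte key {0x61} and the two-byte key
{0x61, 0x00} get the same "nolen" hash, because they zero-pad to the
same chunk, but different "pluslen" hashes. NUL-terminated strings
can't contain embedded zero bytes, so this exact collision can't
happen there, but SMHasher's keys are arbitrary bytes.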