From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joel Jacobson <joel(at)trustly(dot)com>
Cc: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Joerg Sonnenberger <joerg(at)bec(dot)de>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Date: 2019-03-20 20:24:25
Message-ID: 18021.1553113465@sss.pgh.pa.us
Lists: pgsql-hackers
Joel Jacobson <joel(at)trustly(dot)com> writes:
> I've seen a performance trick in other hash functions [1]
> to instead read multiple bytes in each iteration,
> and then handle the remaining bytes after the loop.
> [1] https://github.com/wangyi-fudan/wyhash/blob/master/wyhash.h#L29
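For reference, the pattern Joel is describing looks roughly like this: consume the input a word at a time, then mop up the trailing bytes after the loop. This is a minimal sketch with a placeholder mixing step, not wyhash's actual rounds, and `hash_words` is a hypothetical name:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/*
 * Sketch of word-at-a-time hashing: read 4 bytes per iteration, then
 * handle the remaining 0-3 bytes individually.  The multiply/xor mix
 * is a stand-in, not the mixing used by wyhash or by PostgreSQL.
 */
static uint32_t
hash_words(const unsigned char *k, size_t len)
{
	uint32_t	h = 0;

	while (len >= 4)
	{
		uint32_t	w;

		memcpy(&w, k, 4);			/* portable unaligned 4-byte load */
		h = (h ^ w) * 0x9E3779B1;	/* placeholder mixing step */
		k += 4;
		len -= 4;
	}

	/* tail: consume the remaining 1-3 bytes one at a time */
	while (len > 0)
	{
		h = (h ^ *k++) * 0x9E3779B1;
		len--;
	}
	return h;
}
```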
I can't get very excited about this, seeing that we're only going to
be hashing short strings. I don't really believe your 30% number
for short strings; and even if I did, there's no evidence that the
hash functions are worth any further optimization in terms of our
overall performance.
Also, as best I can tell, the approach you propose would result
in an endianness dependence, meaning we'd have to have separate
lookup tables for BE and LE machines. That's not a dealbreaker
perhaps, but it is certainly another point on the "it's not worth it"
side of the argument.
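To illustrate the endianness point: a 4-byte word load maps the same byte sequence to different integers depending on byte order, so any lookup tables precomputed against such a hash would have to come in BE and LE variants. A minimal standalone sketch (not PostgreSQL code):

```c
#include <stdint.h>
#include <string.h>
#include <stdio.h>

int
main(void)
{
	const unsigned char bytes[4] = {'k', 'e', 'y', 'w'};
	uint32_t	w;

	memcpy(&w, bytes, 4);

	/*
	 * On little-endian hardware this prints 0x7779656b; on big-endian
	 * it prints 0x6b657977.  A hash built on word loads therefore
	 * produces byte-order-dependent values.
	 */
	printf("word = 0x%08x\n", w);
	return 0;
}
```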
regards, tom lane