From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)heroku(dot)com> |
Cc: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: B-Tree support function number 3 (strxfrm() optimization) |
Date: | 2014-09-12 19:02:27 |
Message-ID: | CA+TgmoaRCYscZunEhydMrWz=ft0B5bU8eLe=ze5rfwnsb5GaGA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Sep 12, 2014 at 2:58 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Fri, Sep 12, 2014 at 11:38 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Based on discussion thus far it seems that there's a possibility that
>> the trade-off may be different for short strings vs. long strings. If
>> the string is small enough to fit in the L1 CPU cache, then it may be
>> that memcmp() followed by strcoll() is not much more expensive than
>> strcoll(). That should be easy to figure out: write a standalone C
>> program that creates a bunch of arbitrary, fairly-short strings, say
>> 32 bytes, in a big array.
>
> While I think that's fair, the reason I didn't bother playing tricks
> with only doing a (purely) opportunistic memcmp() when the string size
> is under (say) CACHE_LINE_SIZE bytes is that in order for it to matter
> you'd have to have a use case where the first CACHE_LINE_SIZE of bytes
> matched, and the string just happened to be identical in length, but
> also ultimately differed at least a good fraction of the time. That
> seems like the kind of thing that it's okay to care less about. That
> might have been regressed worse than what you've seen already. It's
> narrow in a whole new dimension, though. The intersection of that
> issue, and the issues exercised by Heikki's existing test case must be
> exceedingly rare.
>
> I'm still confused about whether or not we're talking at cross
> purposes here, Robert. Are you happy to consider this as a separate
> and additional question to the question of what to do in an
> abbreviated comparison tie-break?
I think I've said a few times now that I really want to get this
additional data before forming an opinion. As a certain Mr. Doyle
writes, "It is a capital mistake to theorize before one has data.
Insensibly one begins to twist facts to suit theories, instead of
theories to suit facts." I can't say it any better than that.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2014-09-12 19:08:58 | Re: B-Tree support function number 3 (strxfrm() optimization) |
Previous Message | Peter Geoghegan | 2014-09-12 18:58:46 | Re: B-Tree support function number 3 (strxfrm() optimization) |