From: | Stephen Frost <sfrost(at)snowman(dot)net> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Greg Stark <stark(at)mit(dot)edu>, Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, Thom Brown <thom(at)linux(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: B-Tree support function number 3 (strxfrm() optimization) |
Date: | 2014-04-07 18:17:42 |
Message-ID: | 20140407181742.GX4582@tamriel.snowman.net |
Lists: | pgsql-hackers |
* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> To throw out one more point that I think is problematic, Peter's
> original email on this thread gives a bunch of examples of strxfrm()
> normalization that all differ in the first few bytes - but so do
> the underlying strings. I *think* (but don't have time to check right
> now) that on my MacOS X box, strxfrm() spits out 3 bytes of header
> junk and then 8 bytes per character in the input string - so comparing
> the first 8 bytes of the strxfrm()'d representation would amount to
> comparing part of the first byte. If for any reason the first byte is
> the same (or similar enough) on many of the input strings, then this
> will probably work out to be slower rather than faster. Even if other
> platforms are more space-efficient (and I think at least some of them
> are), I think it's unlikely that this optimization will ever pay off
> for strings that don't differ in the first 8 bytes. And there are
> many cases where that could be true a large percentage of the time
> throughout the input, e.g. YYYY-MM-DD HH:MM:SS timestamps stored as
> text. It seems likely that the patch pessimizes those cases, though of
> course there's no way to know without testing.
Portability and performance concerns were exactly what worried me as
well. It was my hope/understanding that this was a clear win which was
vetted by other large projects across multiple platforms. If that's
actually in doubt and it isn't a clear win then I agree that we can't be
trying to squeeze it in at this late date.
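
To make the concern quoted above concrete, here is a minimal sketch of a
"compare a short strxfrm() prefix first, fall back to a full comparison
on a tie" scheme. It is not the patch under discussion: the names
xfrm_prefix, prefix_then_full_compare and PREFIX_LEN are hypothetical,
and the 8-byte prefix length is simply the figure used in the quoted
text. The point is only that whenever two inputs produce the same
leading bytes, the prefix work is wasted and the full strcoll() still
has to run.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: bytes of the transformed key kept for the cheap comparison. */
#define PREFIX_LEN 8

/* Copy the leading bytes of strxfrm(s) into out, zero-padding short keys. */
static void xfrm_prefix(const char *s, unsigned char out[PREFIX_LEN])
{
    size_t need = strxfrm(NULL, s, 0);  /* length of the full transformed key */
    char *buf = malloc(need + 1);

    assert(buf != NULL);
    strxfrm(buf, s, need + 1);
    memset(out, 0, PREFIX_LEN);
    memcpy(out, buf, need < PREFIX_LEN ? need : PREFIX_LEN);
    free(buf);
}

/*
 * Compare the transformed prefixes first. If they tie, fall back to a
 * full strcoll() and report that through *used_fallback.
 */
static int prefix_then_full_compare(const char *a, const char *b,
                                    int *used_fallback)
{
    unsigned char pa[PREFIX_LEN], pb[PREFIX_LEN];
    int r;

    xfrm_prefix(a, pa);
    xfrm_prefix(b, pb);
    r = memcmp(pa, pb, PREFIX_LEN);
    *used_fallback = (r == 0);
    return r != 0 ? r : strcoll(a, b);
}
```

In the default "C" locale (no setlocale() call), strxfrm() typically
copies its input unchanged, so two timestamps such as
"2014-04-07 18:17:42" and "2014-04-07 18:19:35" share the prefix
"2014-04-" and always take the fallback path. Locales whose transformed
keys spend several bytes per input character would tie even more often.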
> Now it *may well be* that after doing some research and performance
> testing we will conclude that either no commonly-used platforms show
> any regressions or that the regressions that do occur are discountable
> in view of the benefits to more common cases. I just
> don't think mid-April is the right time to start those discussions
> with the goal of a 9.4 commit; and I also don't think committing
> without having those discussions is very prudent.
I agree with this in concept, but I'd be willing to spend a bit of time
researching it, given that it's from a well-known and respected author
who I trust has done much of this research already.
Thanks,
Stephen