Re: Optimizer on sort aggregate

From: Greg Stark <stark(at)mit(dot)edu>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Feng Tian <ftian(at)vitessedata(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimizer on sort aggregate
Date: 2014-10-18 12:27:22
Message-ID: CAM-w4HOSrEvJdvsx=rLL+9Sc2eBPvUKm_rgaE+zAdKHnsqkd4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Oct 18, 2014 at 3:10 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> So the first eight bytes of the first string is 0x131F1F1B2222221E,
> and the second 0x131F1F1B2222220C. The last byte is different.

That's interesting but I think it's mostly a quirk of your example.
Afaics the difference is only that the en_US locale ignores
punctuation like : and / (or at least treats them as less significant
than alphabetic characters). If you had strings that had less
punctuation or differences that didn't happen to arrive shortly after
the 8-byte boundary then it wouldn't make any difference.

And we still have to run strfrm at least once, write out the whole
binary blob to memory somewhere and if it spills to disk we still have
to write and read much more data. I think recognizing cases where
equality is the only thing we're interested in and locale-sensitive
sorting isn't necessary and using a memcmp would be a clear win.

I'm not immediately clear on what the cleanest way to integrate it
would be. A btree opclass function like the cmp function but that
doesn't need to be consistent with < and >, only = ? Or perhaps a flag
on the btree opclass that indicates that the data types can safely be
compared with memcmp when equality is all that's needed? The latter is
pretty tempting since it would tell code something interesting about
the data type's internal storage that may lead to other optimizations.
On the other hand the former is nice in that the operator could maybe
handle other cases like padding by doing memcmp on only the
significant bits.

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nicolas Barbier 2014-10-18 13:06:16 Re: Materialized views don't show up in information_schema
Previous Message Simon Riggs 2014-10-18 09:04:35 Re: pgaudit - an auditing extension for PostgreSQL