From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql.org>
Subject: Re: B-Tree support function number 3 (strxfrm() optimization)
Date: 2014-09-11 20:50:24
Message-ID: CA+TgmoaE7EqUzcdJdDnkOpXpTRYbBKiwBa+AWTa2JX0tmO9nQg@mail.gmail.com
Lists: pgsql-hackers
On Thu, Sep 11, 2014 at 4:13 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Wed, Sep 10, 2014 at 11:36 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> No, not really. All you have to do is write a little test program to
>> gather the information.
>
> I don't think a little test program is useful - IMV it's too much of a
> simplification to suppose that a strcoll() has a fixed cost, and a
> memcmp() has a fixed cost, and that we can determine algebraically
> that we should (say) proceed or not proceed with the additional
> opportunistic "memcmp() == 0" optimization based solely on that. I'm
> not sure if that's what you meant, but it might have been.
I think I said pretty clearly that it was.
> However, I think it's perfectly fair to consider a case where the
> opportunistic "memcmp() == 0" thing never works out (as opposed to
> mostly not helping, which is what I considered earlier [1]), as long
> as we're sorting real tuples. You mentioned Heikki's test case; it
> seems fair to consider that, but for the non-abbreviated case (so no
> abbreviated keys), where only the additional, *totally* opportunistic
> "memcmp() == 0" optimization applies, while still having that
> optimization be 100% useless. Clearly that test case is
> also just about perfectly pessimal for this case too. (Recall that
> Heikki's test case shows performance on pre-sorted input, so there are
> far fewer comparisons than would typically be required anyway - n
> comparisons, or a "bubble sort best case". If I wanted to cheat, I
> could reduce work_mem so that an external tape sort is used, since as
> it happens tapesort doesn't opportunistically check for pre-sorted
> input, but I won't do that. Heikki's case both emphasizes the
> amortized cost of a strxfrm() where we abbreviate, and in this
> instance de-emphasizes the importance of memory latency by having
> access be sequential/predictable.)
>
> The only variation I'm adding here to Heikki's original test case is
> to have a leading int4 attribute that always has a value of 1 -- that
> conveniently removes abbreviation (including strxfrm() overhead) as a
> factor that can influence the outcome, since right now that isn't
> under consideration. So:
>
> create table sorttest (dummy int4, t text);
> insert into sorttest select 1, 'foobarfo' || (g) || repeat('a', 75)
> from generate_series(10000, 30000) g;
>
> Benchmark:
>
> pg(at)hamster:~/tests$ cat heikki-sort.sql
> select * from (select * from sorttest order by dummy, t offset 1000000) f;
>
> pgbench -f heikki-sort.sql -T 100 -n
>
> With optimization enabled
> ====================
> tps = 77.861353 (including connections establishing)
> tps = 77.862260 (excluding connections establishing)
>
> tps = 78.211053 (including connections establishing)
> tps = 78.212016 (excluding connections establishing)
>
> tps = 77.996117 (including connections establishing)
> tps = 77.997069 (excluding connections establishing)
>
> With optimization disabled (len1 == len2 thing is commented out)
> =================================================
>
> tps = 78.719389 (including connections establishing)
> tps = 78.720366 (excluding connections establishing)
>
> tps = 78.764819 (including connections establishing)
> tps = 78.765712 (excluding connections establishing)
>
> tps = 78.472902 (including connections establishing)
> tps = 78.473844 (excluding connections establishing)
>
> So, yes, it looks like I might have just about regressed this case -
> it's hard to be completely sure. However, this is still a very
> unrealistic case: "len1 == len2" invariably holds, yet the
> optimization never works out, whereas the case that benefits [2] is
> quite representative. As I'm sure you were expecting, I still favor
> pursuing this additional optimization.
Well, I have to agree that doesn't look too bad, but your reluctance
to actually do the microbenchmark worries me. Granted,
macrobenchmarks are what actually matters, but they can be noisy and
there can be other confounding factors.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company