Re: New design for FK-based join selectivity estimation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: New design for FK-based join selectivity estimation
Date: 2016-06-20 16:39:11
Message-ID: 28279.1466440751@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
> OK, thanks. The one thing we haven't done is testing the performance, to
> see how this fares. So I've repeated the tests I've done on the original
> version of the patch here

Hmm. I'm not that excited about these results, for a couple of reasons:

* AFAICS, all the numbers are collected from the first execution of a
query within a session, meaning caches aren't populated and everything
has to be loaded from disk (or at least shared buffers).

* I do not credit hundreds of completely redundant FKs between the same
two tables as being representative of plausible real-world cases.

I modified your new script as attached to get rid of the first problem.
Comparing HEAD with HEAD minus commit 100340e2d, in non-assert builds,
I get results like this for the 100-foreign-key case (with repeat
count 1000 for the data collection script):

select code, test, avg(time),stddev(time) from data group by 1,2 order by 1,2;
code | test | avg | stddev
--------+-------+--------------------+---------------------
head | t1/t2 | 0.065045045045045 | 0.00312962651081508
head | t3/t4 | 0.168561561561562 | 0.00379087132124092
head | t5/t6 | 0.127671671671672 | 0.00326275949269809
head | t7/t8 | 0.391057057057056 | 0.00590249325300915
revert | t1/t2 | 0.0613933933933937 | 0.0032082678131875
revert | t3/t4 | 0.0737507507507501 | 0.00221692725859567
revert | t5/t6 | 0.123759759759759 | 0.00431225386651805
revert | t7/t8 | 0.154082082082081 | 0.00405118420422266
(8 rows)

So for the somewhat-credible cases, ie 100 unrelated foreign keys,
I get about 3% - 6% slowdown. The 100-duplicate-foreign-keys case
does indeed look like about a 2X slowdown, but as I said, I do not
think that has anything to do with interesting usage.

In any case, the situation I was worried about making better was
queries joining many tables, which none of this exercises at all.

regards, tom lane

Attachment Content-Type Size
fk-perf-run-2.sh text/plain 695 bytes

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-06-20 16:43:18 Re: 10.0
Previous Message David G. Johnston 2016-06-20 16:38:14 Re: 10.0