From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Add parallel-aware hash joins. |
Date: | 2018-01-24 01:41:21 |
Message-ID: | CAEepm=1c7SpPfsFtFS0gJaPYXwLu=BpCi8WX3Ws7DsuVh1KzkQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-committers pgsql-hackers |
On Wed, Jan 24, 2018 at 12:10 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> There is a very clear secular trend up in the longer data series,
> which indicates that we're testing more stuff,
+1
> which doesn't bother
> me in itself as long as the time is well spent. However, the trend
> over the last two months is very bad, and I do not think that we can
> point to any large improvement in test coverage that someone committed
> since November.
I'm not sure if coverage.postgresql.org has a way to view historical
reports so we can see the actual percentage change, but as I recall
it, commit fa330f9ad "Add some regression tests that exercise hash
join code." pushed nodeHash.c and possibly nodeHashJoin.c into green
territory on here:
https://coverage.postgresql.org/src/backend/executor/index.html
> Looking more closely at the shorter series, there are four pretty obvious
> step changes since 2016-09. The PNG's x-axis doesn't have enough
> resolution to match these up to commits, but looking at the underlying
> data, they clearly correspond to:
>
> Branch: master Release: REL_10_BR [b801e1200] 2016-10-18 15:57:58 -0400
> Improve regression test coverage for hash indexes.
>
> Branch: master Release: REL_10_BR [4a8bc39b0] 2017-04-12 16:17:53 -0400
> Speed up hash_index regression test.
>
> Branch: master [fa330f9ad] 2017-11-29 16:06:50 -0800
> Add some regression tests that exercise hash join code.
Joining check runtimes with the commit log (see attached), I see:
2017-11-30 | fa330f9a | Add some regression tests that exercise | 00:08:30
2017-11-29 | 84940644 | New C function: bms_add_range | 00:08:18
That's +2.4%.
> Branch: master [180428404] 2017-12-21 00:43:41 -0800
> Add parallel-aware hash joins.
2017-12-21 | cce1ecfc | Adjust assertion in GetCurrentCommandId. | 00:09:03
2017-12-21 | 6719b238 | Rearrange execution of PARAM_EXTERN Para |
2017-12-21 | 8a0596cb | Get rid of copy_partition_key |
2017-12-21 | 9ef6aba1 | Fix typo |
2017-12-21 | c98c35cd | Avoid putting build-location-dependent s |
2017-12-21 | 59d1e2b9 | Cancel CV sleep during subtransaction ab |
2017-12-21 | 18042840 | Add parallel-aware hash joins. |
2017-12-20 | f94eec49 | When passing query strings to workers, p | 00:08:45
That's +3.4%. That's a bit more than I expected. I saw 2.5% on my
development box and hoped that'd be OK for a complex feature with a
lot of paths to test.
But hang on a minute -- how did we get to 08:45 from 08:30 between
those commits? Of course this is all noisy data and individual
samples are all over the place, but I think I see some signal here:
2017-12-20 | f94eec49 | When passing query strings to workers, p | 00:08:45
2017-12-19 | 7d3583ad | Test instrumentation of Hash nodes with | 00:08:43
2017-12-19 | 8526bcb2 | Try again to fix accumulation of paralle |
2017-12-19 | 38fc5470 | Re-fix wrong costing of Sort under Gathe | 00:08:31
2017-12-19 | 09a65f5a | Mark a few parallelism-related variables | 00:08:27
Both 8526bcb2 and 7d3583ad add Gather rescans. Doesn't 8526bcb2 add a
query that forks 40 times? I wonder how long it takes to start a
background worker on a Mac Cube. From that commit:
+ -> Gather (actual rows=9800 loops=10)
+ Workers Planned: 4
+ Workers Launched: 4
+ -> Parallel Seq Scan on tenk1 (actual rows=1960 loops=50)
> I thought that the hash index test case was excessively expensive for
> what it covered, and I'm now thinking the same about hash joins.
Does join-test-shrink.patch (from my earlier message) help much, on
prairiedog? It cut "check" time by ~1.5% on my low end machine.
--
Thomas Munro
http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
prairiedog-check-speed-vs-commit-log.txt | text/plain | 31.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2018-01-24 02:56:45 | Re: pgsql: Allow UPDATE to move rows between partitions. |
Previous Message | Bruce Momjian | 2018-01-23 23:23:07 | pgsql: doc: mention psql -l uses the 'postgres' database by default |
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2018-01-24 01:44:26 | Re: Failed to request an autovacuum work-item in silence |
Previous Message | Masahiko Sawada | 2018-01-24 01:39:32 | Re: Remove utils/dsa.h from autovacuum.c |