From: | Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | "Kato, Sho" <kato-sho(at)jp(dot)fujitsu(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: speeding up planning with partitions |
Date: | 2018-09-03 05:12:18 |
Message-ID: | b624dfc1-3e90-69fe-7c6f-5675ed4a5b1b@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thank you Kato-san for testing.
On 2018/08/31 19:48, Kato, Sho wrote:
> Hi, Amit
>
> Great!
> With the total number of records being 6400, I benchmarked while increasing the number of partitions from 100 to 6400.
> Applying three all patches, 20% performance improved for 100 partitions.
>
> I have the same number of records for each partition, do you do the same?
I didn't load any data into tables when running the tests, because these
patches are meant for reducing planner latency. More specifically,
they're addressed to fix the current planner behavior that its latency
increases with increasing number of partitions, with focus on the common
case where only a single partition will need to be scanned by a given query.
I'd try to avoid using a benchmark whose results is affected by anything
other than the planning latency. It will be a good idea if you leave the
tables empty and just vary the number of partitions and nothing else.
Also, we're interested in planning latency, so using just SELECT and
UPDATE in your benchmark script will be enough, because their planning
time is affected by the number of partitions. No need to try to measure
the INSERT latency, because its planning latency is not affected by the
number of partitions. Moreover, I'd rather suggest you take out the
INSERT statement from the benchmark for now, because its execution time
does vary unfavorably with increasing number of partitions. Sure, there
are other patches to address that, but it's better not to mix the patches
to avoid confusion.
> Also, in my case, performance was better when not prepare.
Patches in this thread do nothing for the execution, so, there is no need
to compare prepared vs non-prepared. In fact, just measure non-prepared
tps and latency, because we're only interested in planning time here.
> I think these patches do not improve execute case, so we need faster runtime pruning patch[1], right?
We already have run-time pruning code (that is the code in the patch you
linked) committed into the tree in PG 11, so the master tree also has it.
But since we're not interested in execution time, no need to worry about
run-time pruning.
> Details of measurement conditions and results are as follows.
> - base source
> master(@777e6ddf17) + Speeding up Insert v8 patch[1]
>
> - table definition(e.g. 100 partition)
> create table test.accounts(aid serial, abalance int)
> partition by range(aid);
> create table test.accounts_history(id serial, aid int, delta int,
> mtime timestamp without time zone)
> partition by range(aid);
>
> - command option
> pgbench -d testdb -f benchmark.pgbench -T 180 -r -n -M prepare
> pgbench -d testdb -f benchmark.pgbench -T 180 -r -n
>
> -results
> base source no prepared
> part_num | tps_ex | update_latency | select_latency | insert_latency
> ----------+------------+----------------+----------------+----------------
> 100 | 662.414805 | 0.357 | 0.265 | 0.421
> 200 | 494.478431 | 0.439 | 0.349 | 0.579
> 400 | 307.982089 | 0.651 | 0.558 | 0.927
> 800 | 191.360676 | 0.979 | 0.876 | 1.548
> 1600 | 75.344947 | 2.253 | 2.003 | 4.301
> 3200 | 30.643902 | 5.716 | 4.955 | 10.118
> 6400 | 16.726056 | 12.512 | 8.582 | 18.054
>
> 0001 no prepared
> part_num | tps_ex | update_latency | select_latency | insert_latency
> ----------+------------+----------------+----------------+----------------
> 100 | 429.816329 | 0.811 | 0.75 | 0.365
> 200 | 275.211531 | 1.333 | 1.248 | 0.501
> 400 | 160.499833 | 2.384 | 2.252 | 0.754
> 800 | 79.387776 | 4.935 | 4.698 | 1.468
> 1600 | 24.787377 | 16.593 | 15.954 | 4.302
> 3200 | 9.846421 | 42.96 | 42.139 | 8.848
> 6400 | 4.919772 | 87.43 | 83.109 | 16.56
Hmm, since 0001 is meant to improve update planning time, it seems odd
that you'd get poorer results compared to base source. But, it seems base
source is actually master + the patch to improve the execution time of
update, so maybe that patch is playing a part here, although I'm not sure
why even that's making this much of a difference.
I suggest that you use un-patched master as base source, that is, leave
out any patches to improve execution time.
[ ... ]
> 0001 + 0002 no prepared
> part_num | tps_ex | update_latency | select_latency | insert_latency
> ----------+------------+----------------+----------------+----------------
> 100 | 682.53091 | 0.388 | 0.35 | 0.35
> 200 | 469.906601 | 0.543 | 0.496 | 0.51
> 400 | 321.915349 | 0.78 | 0.721 | 0.752
> 800 | 201.620975 | 1.246 | 1.156 | 1.236
> 1600 | 94.438204 | 2.612 | 2.335 | 2.745
> 3200 | 38.292922 | 6.657 | 5.579 | 6.808
> 6400 | 21.48462 | 11.989 | 10.104 | 12.601
>
[ ... ]
>
> 0001 + 0002 + 0003 no prepared
> part_num | tps_ex | update_latency | select_latency | insert_latency
> ----------+------------+----------------+----------------+----------------
> 100 | 798.962029 | 0.304 | 0.267 | 0.339
> 200 | 577.893396 | 0.384 | 0.346 | 0.487
> 400 | 426.542177 | 0.472 | 0.435 | 0.717
> 800 | 288.616213 | 0.63 | 0.591 | 1.162
> 1600 | 154.247034 | 1.056 | 0.987 | 2.384
> 3200 | 59.711446 | 2.416 | 2.233 | 6.514
> 6400 | 37.109761 | 3.387 | 3.099 | 11.762
>
[ ... ]
By the way, as you may have noticed, I posted a version 2 of the patches
on this thread. If you apply them, you will be be able to see almost same
TPS for any number of partitions with master + 0001 + 0002 + 0003.
> Although it may not be related to this, when measured with pg11 beta2, somehow the performance was better.
>
> 11beta2 + v1-0001-Speed-up-INSERT-and-UPDATE-on-partitioned-tables.patch[3] prepared
> part_num | tps_ex | update_latency | select_latency | insert_latency
> ----------+-------------+----------------+----------------+----------------
> 100 | 756.07228 | 0.942 | 0.091 | 0.123
I guess if you had applied the latest version of
"Speed-up-INSERT-and-UPDATE-on-partitioned-tables.patch" (which is v8
posted at [1]) on top of master + 0001 + 0002 + 0003, you'd get better
performance too. But, as mentioned above, we're interested in measuring
planning latency, not execution latency, so we should leave out any
patches that are meant toward improving execution latency to avoid confusion.
Thanks again.
Regards,
Amit
From | Date | Subject | |
---|---|---|---|
Next Message | Dilip Kumar | 2018-09-03 05:56:30 | Re: Undo logs |
Previous Message | Tatsuo Ishii | 2018-09-03 05:02:57 | Re: Adding a note to protocol.sgml regarding CopyData |