From: | Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: [HACKERS] Secondary index access optimizations |
Date: | 2018-03-22 09:38:33 |
Message-ID: | 9d3154bd-5e2d-0e0a-4975-ec76451bae90@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 21.03.2018 20:30, Konstantin Knizhnik wrote:
>
>
> On 01.03.2018 23:15, Andres Freund wrote:
>> Hi,
>>
>>
>> This patch seems like quite a good improvement. One thing I've not
>> really looked at but am slightly concerned in passing: Are there cases
>> where we now would do a lot of predicate pruning work even though the
>> overhead of just evaluating the qual is trivial, e.g. because there's
>> only one row due to a pkey? The predtest code is many things but
>> lightning fast is not one of them. Obviously that won't matter for
>> analytics queries, but I could see it hurt in OLTPish workloads...
>>
>> Greetings,
>>
>> Andres Freund
>
> Hi,
> I am sorry for delay with answer.
>
> I understand your concern and did the following experiment.
> I have initialized the database in the following way:
>
> create table base (k integer primary key, v integer);
> create table part1 (check (k between 1 and 10000)) inherits (base);
> create table part2 (check (k between 10001 and 20000)) inherits (base);
> create index pi1 on part1(v);
> create index pi2 on part2(v);
> insert into part1 values (generate series(1,10000), random()*100000);
> insert into part2 values (generate_series(10001,20000), random()*100000);
> vacuum analyze part1;
> vacuum analyze part2;
>
> Vanilla Postgres uses the following plan:
>
> explain select * from base where k between 1 and 20000 and v = 100;
> QUERY PLAN
> -----------------------------------------------------------------------
> Append (cost=0.00..16.63 rows=3 width=8)
> -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8)
> Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
> -> Index Scan using pi1 on part1 (cost=0.29..8.31 rows=1 width=8)
> Index Cond: (v = 100)
> Filter: ((k >= 1) AND (k <= 20000))
> -> Index Scan using pi2 on part2 (cost=0.29..8.31 rows=1 width=8)
> Index Cond: (v = 100)
> Filter: ((k >= 1) AND (k <= 20000))
> (9 rows)
>
> Execution of this query 100000 times gives is done in 12 seconds.
> With applied patch query plan is changed to:
>
> QUERY PLAN
> -----------------------------------------------------------------------
> Append (cost=0.00..16.62 rows=3 width=8)
> -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8)
> Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
> -> Index Scan using pi1 on part1 (cost=0.29..8.30 rows=1 width=8)
> Index Cond: (v = 100)
> -> Index Scan using pi2 on part2 (cost=0.29..8.30 rows=1 width=8)
> Index Cond: (v = 100)
> (7 rows)
>
> Elapsed time of 100000 query executions is 13 seconds.
> So you was right that increased query optimization time exceeds
> advantage of extra checks elimination.
> But it is true only if we are not using prepare statements.
> With prepared statements results are the following:
>
> Vanilla: 0m3.915s
> This patch: 0m3.563s
>
> So improvement is not so large, but it exists.
> If somebody wants to repeat my experiments, I attached to this mail
> small shell script which I used to run query several times.
> Non-prepared query is launched using the following command:
>
> time ./repeat-query.sh "select * from base where k between 1 and 20000
> and v = 100" 100000
>
> and prepared query:
>
> time ./repeat-query.sh "execute select100" 100000 "prepare select100
> as select * from base where k between 1 and 20000 and v = 100"
>
> And once again I want to notice that using prepared statements can
> increase performance almost 3 times!
> As far as using prepared statements is not always possible I want to
> recall autoprepare patch waiting at the commitfest:
>
> https://www.postgresql.org/message-id/flat/8eed9c23-19ba-5404-7a9e-0584b836b3f3%40postgrespro(dot)ru#8eed9c23-19ba-5404-7a9e-0584b836b3f3(at)postgrespro(dot)ru
>
> --
> Konstantin Knizhnik
> Postgres Professional:http://www.postgrespro.com
> The Russian Postgres Company
Attached please find rebased version of the patch.
Also I do more testing, now using pgbench.
Scripts for initialization of the database and for custom script for
pgbench are attached.
Results at my computer are the following:
pgbench options
Vanilla
Patch
-c 1
9208
8289
-c 1 -M prepared
38503 41206
-c 10
39224
34040
-c 10 -M prepared
165465
172874
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment | Content-Type | Size |
---|---|---|
optimizer-11.patch | text/x-patch | 49.8 KB |
select.pgbench | text/plain | 83 bytes |
init.sql | application/sql | 440 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Dean Rasheed | 2018-03-22 09:56:11 | Re: MCV lists for highly skewed distributions |
Previous Message | Fabien COELHO | 2018-03-22 09:32:13 | Re: pg_stat_statements HLD for futur developments |