From: | Sadhuprasad Patro <b(dot)sadhu(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Next Steps with Hash Indexes |
Date: | 2021-09-01 10:40:20 |
Message-ID: | CAFF0-CEB1U27wmStSsexzsgf=X0RRHNjNA-PmUSdQxcSQhqcwQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
>
> That's a significant difference. Have you checked via perf or some
> other way what causes this difference? I have seen that sometimes
> single client performance with pgbench is not stable, so can you
> please once check with 4 clients or so and possibly with a larger
> dataset as well.
I have verified manually, without the PGBENCH tool also. I can see a
significant difference for each query fired in both the versions of
patch implemented. We can see as mentioned below, I have run the SAME
query on the SAME dataset on both patches. We have a significant
performance impact with Separate Hash values for multiple key columns.
SingleHash_MultiColumn:
postgres=# create table perftest(a int, b int, c int, d int, e int, f int);
CREATE TABLE
postgres=# insert into perftest values (generate_series(1, 10000000),
generate_series(1, 10000000), generate_series(1, 10000000), 9, 7);
INSERT 0 10000000
postgres=# create index idx on perftest using hash(a, b, c);
CREATE INDEX
postgres=# select * from perftest where a=5999 and b=5999 and c=5999;
a | b | c | d | e | f
------+------+------+---+---+---
5999 | 5999 | 5999 | 9 | 7 |
(1 row)
Time: 2.022 ms
postgres=# select * from perftest where a=597989 and b=597989 and c=597989;
a | b | c | d | e | f
--------+--------+--------+---+---+---
597989 | 597989 | 597989 | 9 | 7 |
(1 row)
Time: 0.867 ms
postgres=# select * from perftest where a=6297989 and b=6297989 and c=6297989;
a | b | c | d | e | f
---------+---------+---------+---+---+---
6297989 | 6297989 | 6297989 | 9 | 7 |
(1 row)
Time: 1.439 ms
postgres=# select * from perftest where a=6290798 and b=6290798 and c=6290798;
a | b | c | d | e | f
---------+---------+---------+---+---+---
6290798 | 6290798 | 6290798 | 9 | 7 |
(1 row)
Time: 1.013 ms
postgres=# select * from perftest where a=6290791 and b=6290791 and c=6290791;
a | b | c | d | e | f
---------+---------+---------+---+---+---
6290791 | 6290791 | 6290791 | 9 | 7 |
(1 row)
Time: 0.903 ms
postgres=# select * from perftest where a=62907 and b=62907 and c=62907;
a | b | c | d | e | f
-------+-------+-------+---+---+---
62907 | 62907 | 62907 | 9 | 7 |
(1 row)
Time: 0.894 ms
SeparateHash_MultiColumn:
postgres=# create table perftest(a int, b int, c int, d int, e int, f int);
CREATE TABLE
postgres=# insert into perftest values (generate_series(1, 10000000),
generate_series(1, 10000000), generate_series(1, 10000000), 9, 7);
INSERT 0 10000000
postgres=# create index idx on perftest using hash(a, b, c);
CREATE INDEX
postgres=# select * from perftest where a=5999 and b=5999 and c=5999;
a | b | c | d | e | f
------+------+------+---+---+---
5999 | 5999 | 5999 | 9 | 7 |
(1 row)
Time: 2.915 ms
postgres=# select * from perftest where a=597989 and b=597989 and c=597989;
a | b | c | d | e | f
--------+--------+--------+---+---+---
597989 | 597989 | 597989 | 9 | 7 |
(1 row)
Time: 1.129 ms
postgres=# select * from perftest where a=6297989 and b=6297989 and c=6297989;
a | b | c | d | e | f
---------+---------+---------+---+---+---
6297989 | 6297989 | 6297989 | 9 | 7 |
(1 row)
Time: 2.454 ms
postgres=# select * from perftest where a=6290798 and b=6290798 and c=6290798;
a | b | c | d | e | f
---------+---------+---------+---+---+---
6290798 | 6290798 | 6290798 | 9 | 7 |
(1 row)
Time: 2.327 ms
postgres=# select * from perftest where a=6290791 and b=6290791 and c=6290791;
a | b | c | d | e | f
---------+---------+---------+---+---+---
6290791 | 6290791 | 6290791 | 9 | 7 |
(1 row)
Time: 1.676 ms
postgres=# select * from perftest where a=62907 and b=62907 and c=62907;
a | b | c | d | e | f
-------+-------+-------+---+---+---
62907 | 62907 | 62907 | 9 | 7 |
(1 row)
Time: 2.614 ms
If I do a test with 4 clients, then there is not much visible
difference. I think this is because of contentions. And here our focus
is single thread & single operation performance.
>
> One more thing to consider is that it seems that the planner requires
> a condition for the first column of an index before considering an
> indexscan plan. See Tom's email [1] in this regard. I think it would
> be better to see what kind of work is involved there if you want to
> explore a single hash value for all columns idea.
>
> [1] - https://www.postgresql.org/message-id/29263.1506483172%40sss.pgh.pa.us
About this point, I will analyze further and update.
Thanks & Regards
SadhuPrasad
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Daniel Gustafsson | 2021-09-01 10:43:59 | Re: support for MERGE |
Previous Message | Tony Reix | 2021-09-01 10:29:32 | Re: AIX: Symbols are missing in libpq.a |