Re: Add ExprState hashing for GROUP BY and hashed SubPlans

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add ExprState hashing for GROUP BY and hashed SubPlans
Date: 2024-12-10 22:36:30
Lists: pgsql-hackers

On Wed, 11 Dec 2024 at 00:44, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> I benchmarked a slightly revised version of the v6-0002 (attached as
> v7-0001) with the following hash join case.
>
> create table t1 as select a/100000 as a from generate_series(0,999999)a;
> create table t2 as select a from generate_series(0,9)a;
> vacuum freeze analyze t1,t2;
>
> bench.sql:
> select * from t1 inner join t2 on t1.a=t2.a offset 1000000;
>
> for i in {1..50}; do
>   pg_ctl stop -D pgdata > /dev/null &&
>   pg_ctl start -D pgdata -l pg.log > /dev/null &&
>   psql -c "select pg_prewarm('t1'),pg_prewarm('t2');" postgres > /dev/null &&
>   sleep 10 &&
>   pgbench -n -f bench.sql -T 10 -M prepared postgres | grep tps;
> done
>
> The patch gave me a 10.62% speed-up for an inner hash join on a single
> column on an AMD Zen4 7945HX. The attached graph shows master vs the
> v7-0001 patch, with tps on the left vertical axis and the percentage
> performance improvement on the right axis. The tps results from the
> 50 runs were sorted, and each value is plotted.
>
> I'm planning on pushing the v7-0001 patch tomorrow and will look
> further at 0002 after that.

I did some further benchmarking of this. The Apple M2 machine I have
here saw a 6.8% speed-up on the above test and the AMD 3990x machine
saw 1.93%. Both figures are from 50 runs, with Postgres restarted
before each run. Graphs attached.
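
For anyone repeating the test, here is a minimal sketch of one way to turn
the per-run "tps = ..." lines into a single percentage. It is not
necessarily how the figures above were computed. The function name
speedup_pct and the file names master.txt and patched.txt are hypothetical.
The sketch assumes each file holds only the lines printed by the
"pgbench ... | grep tps" loop quoted above, with the tps value in the
third whitespace-separated field, and the same number of runs in both
files.

```shell
# Hypothetical helper: speedup_pct MASTER_FILE PATCHED_FILE
# Prints the percentage improvement of total patched tps over total
# master tps. With equal run counts this equals the improvement in
# mean tps.
speedup_pct() {
    awk '
        # Lines from the first file are master runs.
        FILENAME == ARGV[1] { master += $3; next }
        # Everything else comes from the second file: patched runs.
        { patched += $3 }
        END {
            # Print nothing when there is no master data to compare with.
            if (master > 0)
                printf "%.2f\n", (patched / master - 1) * 100
        }
    ' "$1" "$2"
}
```

Usage would be something like "speedup_pct master.txt patched.txt", after
redirecting the loop's output into one file per build. Comparing means hides
the run-to-run spread, so it complements the sorted per-run graphs rather
than replacing them.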

I've now pushed v7-0001.


