From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Hash Indexes |
Date: | 2016-09-14 23:14:08 |
Message-ID: | CAMkU=1zd1KXNHQxRD6fBx4+Vu5Hnt2H7vzH1KfGAU9VoQcQy8A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, May 10, 2016 at 5:09 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
>
>
> Although, I don't think it is a very good idea to take any performance
> data with WIP patch, still I couldn't resist myself from doing so and below
> are the performance numbers. To get the performance data, I have dropped
> the primary key constraint on pgbench_accounts and created a hash index on
> aid column as below.
>
> alter table pgbench_accounts drop constraint pgbench_accounts_pkey;
> create index pgbench_accounts_pkey on pgbench_accounts using hash(aid);
>
To be rigorously fair, you should probably replace the btree primary key
with a non-unique btree index and use that in the btree comparison case. I
don't know how much difference that would make, probably none at all for a
read-only case.
>
>
> Below data is for read-only pgbench test and is a median of 3 5-min runs.
> The performance tests are executed on a power-8 m/c.
>
With pgbench -S where everything fits in shared_buffers and the number of
cores I have at my disposal, I am mostly benchmarking interprocess
communication between pgbench and the backend. I am impressed that you can
detect any difference at all.
For this type of thing, I like to create a server side function for use in
benchmarking:
create or replace function pgbench_query(scale integer,size integer)
RETURNS integer AS $$
DECLARE sum integer default 0;
amount integer;
account_id integer;
BEGIN FOR i IN 1..size LOOP
account_id=1+floor(random()*scale);
SELECT abalance into strict amount FROM pgbench_accounts
WHERE aid = account_id;
sum := sum + amount;
END LOOP;
return sum;
END $$ LANGUAGE plpgsql;
And then run using a command like this:
pgbench -f <(echo 'select pgbench_query(40,1000)') -c$j -j$j -T 300
Where the first argument ('40', here) must be manually set to the same
value as the scale-factor.
With 8 cores and 8 clients, the values I get are, for btree, hash-head,
hash-concurrent, hash-concurrent-cache, respectively:
598.2
577.4
668.7
664.6
(each transaction involves 1000 select statements)
So I do see that the concurrency patch is quite an improvement. The cache
patch does not produce a further improvement, which was somewhat surprising
to me (I thought that that patch would really shine in a read-write
workload, but I expected at least improvement in read only)
I've run this was 128MB shared_buffers and scale factor 40. Not everything
fits in shared_buffers, but quite easily fits in RAM, and there is no
meaningful IO caused by the benchmark.
Cheers,
Jeff
From | Date | Subject | |
---|---|---|---|
Next Message | Petr Jelinek | 2016-09-14 23:15:29 | Re: Logical Replication WIP |
Previous Message | Thomas Munro | 2016-09-14 23:04:15 | Re: kqueue |