Re: Should we cacheline align PGXACT?

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Should we cacheline align PGXACT?
Date: 2017-02-11 18:31:21
Message-ID: CAPpHfdt1hrUr4DcAr9HQjXGLqouLqAmBHwjkUu1EctmnhhQB3g@mail.gmail.com
Lists: pgsql-hackers

On Sat, Feb 11, 2017 at 4:17 PM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> On 02/11/2017 01:21 PM, Alexander Korotkov wrote:
>
>> Hi, Tomas!
>>
>> On Sat, Feb 11, 2017 at 2:28 AM, Tomas Vondra
>> <tomas(dot)vondra(at)2ndquadrant(dot)com>
>> wrote:
>>
>> As discussed at the Developer meeting ~ a week ago, I've run a
>> number of benchmarks on the commit, on small/medium-size x86
>> machines. I currently don't have access to a machine as big as the one
>> used by Alexander (with 72 physical cores), but it seems useful to verify
>> that the patch does not have a negative impact on smaller machines.
>>
>> In particular I've run these tests:
>>
>> * r/o pgbench
>> * r/w pgbench
>> * 90% reads, 10% writes
>> * pgbench with skewed distribution
>> * pgbench with skewed distribution and skipping
>>
>>
>> Thank you very much for your efforts!
>> I took a look at these tests. One thing caught my eye: you warm up the
>> database using a pgbench run. Did you consider using pg_prewarm instead?
>>
>> SELECT sum(x.x) FROM (SELECT pg_prewarm(oid) AS x FROM pg_class WHERE
>> relkind IN ('i', 'r') ORDER BY oid) x;
>>
>> In my experience pg_prewarm both takes less time and leaves less
>> variation afterwards.
>>
>>
> I've considered it, but the problem I see in using pg_prewarm for
> benchmarking purposes is that it only loads the data into memory, but it
> does not modify the tuples (so all tuples have the same xmin/xmax, no dead
> tuples, ...), it does not set usage counters on the buffers and also does
> not generate any clog records.
>

Yes, but please note that pgbench runs VACUUM first, so all the tuples
would be hinted. For xmin/xmax and clog to take effect, you should run
the subsequent pgbench with --no-vacuum. Also, buffer usage counters
matter only when eviction happens, i.e. when the data don't fit into
shared_buffers.

Also, you use pgbench -S to warm up before the read-only test. I think
pg_prewarm would be much better there, unless your data is bigger than
shared_buffers.

> I don't think there's a lot of variability in the results I measured. If
> you look at (max-min) for each combination of parameters, the delta is
> generally within 2% of average, with a very few exceptions, usually caused
> by the first run (so perhaps the warmup should be a bit longer).
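
To make the (max-min) check in the quoted paragraph concrete, here is a small Python sketch of that spread metric. It assumes the per-run tps values for one parameter combination are available as a list; the function name and the sample numbers are hypothetical and are not taken from the benchmark results.

```python
def relative_spread(tps_runs):
    """Return (max - min) / mean for the runs of one parameter combination."""
    if not tps_runs:
        raise ValueError("need at least one run")
    mean = sum(tps_runs) / len(tps_runs)
    return (max(tps_runs) - min(tps_runs)) / mean


if __name__ == "__main__":
    # Hypothetical tps values, NOT measurements from this thread.
    runs = [10100.0, 10000.0, 9900.0]
    print(f"relative spread: {relative_spread(runs):.1%}")
```

Computing the same value with the first run dropped (`relative_spread(runs[1:])`) is one way to see whether the first run alone is responsible for an outlier.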

You could also run pg_prewarm before the warm-up pgbench for the read-write
test. My intuition is that you would get more stable results with a shorter
warm-up.
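
As a rough sketch of that sequence, the Python below only builds the command lines (it does not run them). It assumes the pg_prewarm extension is already installed in the target database (CREATE EXTENSION pg_prewarm) and that psql and pgbench are on the PATH; the function name, database name, client count and durations are hypothetical placeholders.

```python
# The pg_prewarm query quoted earlier in this message.
PREWARM_SQL = (
    "SELECT sum(x.x) FROM (SELECT pg_prewarm(oid) AS x FROM pg_class "
    "WHERE relkind IN ('i', 'r') ORDER BY oid) x;"
)


def benchmark_plan(dbname, read_only, warmup_seconds=60, run_seconds=300, clients=16):
    """Build (but do not execute) the commands for one benchmark sequence."""
    # Step 1: load tables and indexes into shared_buffers with pg_prewarm.
    plan = [["psql", "-d", dbname, "-c", PREWARM_SQL]]

    if not read_only:
        # Step 2 (read-write only): a short pgbench warm-up, so that tuples
        # get modified and clog records are generated.
        plan.append(["pgbench", "-c", str(clients), "-T", str(warmup_seconds), dbname])

    # Step 3: the measured run. --no-vacuum keeps pgbench from vacuuming
    # first; -S selects the built-in select-only script.
    measured = ["pgbench", "--no-vacuum", "-c", str(clients), "-T", str(run_seconds)]
    if read_only:
        measured.append("-S")
    measured.append(dbname)
    plan.append(measured)
    return plan


if __name__ == "__main__":
    for cmd in benchmark_plan("bench", read_only=False):
        print(" ".join(cmd))
```

Each inner list can be passed to subprocess.run(cmd, check=True) once the names and durations are adjusted to the actual setup.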

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
