From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Wait free LW_SHARED acquisition - v0.9
Date: 2014-10-10 07:57:56
Message-ID: 20141010075756.GL29124@awork2.int
Lists: pgsql-hackers
Hi,
On 2014-10-10 10:13:03 +0530, Amit Kapila wrote:
> I have done a few performance tests for the above patches and the results
> are as below:
Cool, thanks.
> Performance Data
> ------------------------------
> IBM POWER-7 16 cores, 64 hardware threads
> RAM = 64GB
> max_connections = 210
> Database Locale = C
> checkpoint_segments = 256
> checkpoint_timeout = 35min
> shared_buffers = 8GB
> Client Count = number of concurrent sessions and threads (ex. -c 8 -j 8)
> Duration of each individual run = 5mins
> Test type - read only pgbench with -M prepared
> Other related information about the test
> a. This is the median of 3 runs; the detailed data of the individual runs
> is attached to this mail.
> b. I applied both patches when taking the performance data.
>
> Scale Factor - 100
>
> Patch_ver/Client_count      1       8      16      32      64     128
> HEAD                    13344  106921  196629  295123  377846  333928
> PATCH                   13662  106179  203960  298955  452638  465671
>
> Scale Factor - 3000
>
> Patch_ver/Client_count      8      16      32      64     128     160
> HEAD                    86920  152417  231668  280827  257093  255122
> PATCH                   87552  160313  230677  276186  248609  244372
>
>
> Observations
> ----------------------
> a. The patch performs really well (an increase of up to ~40%) in case all
> the data fits in shared buffers (scale factor 100).
> b. In case the data doesn't fit in shared buffers, but fits in RAM
> (scale factor 3000), there is a performance increase up to a client count
> of 16; after that it starts dipping (in the above config, by up to ~4.4%).
Hm. Interesting. I don't see that dip on x86.
> The above data shows that the patch improves performance for cases
> where there is shared LWLock contention; however, there is a slight
> performance dip in the exclusive LWLock case (at scale factor 3000,
> exclusive LWLocks are needed for the buffer mapping tables). Now I am
> not sure whether this is the worst-case dip or whether the dip can be
> higher under similar configurations, because the trend shows that the
> dip increases with higher client counts.
>
> Brief analysis of the code w.r.t. the performance dip
> ---------------------------------------------------------------------
> Extra instructions w.r.t. HEAD in the exclusive-lock acquire path
> a. Attempt the lock twice
> b. atomic operations for nwaiters in LWLockQueueSelf() and
> LWLockAcquireCommon()
> c. Now we need to take the spinlock twice, once for queuing ourselves and
> then again for setting releaseOK.
> d. a few function calls and some extra checks
Hm. I can't really see the number of atomics itself matter - a spinning
lock will do many more atomic ops than this. But I wonder whether we
could get rid of the releaseOK lock. Should be quite possible.
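
For context, the kind of fast path I mean - a minimal standalone sketch
with made-up names and C11 atomics rather than the pg_atomic_* wrappers,
and no wait-queue handling, so not the patch itself - the uncontended
exclusive acquisition is a single compare-and-exchange, and the shared
case is one CAS loop:

/*
 * Sketch only: hypothetical names, C11 atomics, no wait queue.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_EXCLUSIVE ((uint32_t) 1 << 31)	/* exclusive-holder bit */

typedef struct
{
	_Atomic uint32_t state;		/* exclusive bit | count of shared holders */
} SketchLock;

/* Uncontended exclusive acquisition: a single compare-and-exchange. */
static bool
sketch_try_exclusive(SketchLock *lock)
{
	uint32_t	expected = 0;

	return atomic_compare_exchange_strong(&lock->state, &expected,
										  SKETCH_EXCLUSIVE);
}

/*
 * Shared acquisition: one CAS loop that only retries when racing against
 * other shared acquirers; bail out as soon as an exclusive holder shows up.
 */
static bool
sketch_try_shared(SketchLock *lock)
{
	uint32_t	old = atomic_load(&lock->state);

	while ((old & SKETCH_EXCLUSIVE) == 0)
	{
		/* on failure, 'old' is reloaded with the current value */
		if (atomic_compare_exchange_weak(&lock->state, &old, old + 1))
			return true;
	}
	return false;
}

A spinlock-protected state word would, on top of touching the state
itself, need at least an atomic test-and-set to acquire the spinlock and
a write to release it - that's why I don't think the raw atomic count is
what matters here.
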
> Now these probably shouldn't matter much in case the backend needs to
> wait for another exclusive locker, but I am not sure what else could be
> the reason for the dip in the case where we need exclusive LWLocks.
Any chance to get a profile?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services