Re: User concurrency thresholding: where do I look?

From: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: User concurrency thresholding: where do I look?
Date: 2007-07-26 14:29:51
Message-ID: 46A8AFDF.3050804@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

The count is only for a 10-second snapshot.. Plus remember there are
about 1000 users running so the connection being profiled only gets
0.01 of the period on CPU.. And the count is for that CONNECTION only.

Anyway using the lock wait script it shows the real picture as you
requested. Here the combined time means time "spent waiting" for the lock.

bash-3.00# echo 500 users
500 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id Mode Count
OidGenLock Exclusive 1
CheckpointStartLock Shared 3
OidGenLock Shared 4
WALInsertLock Exclusive 7
FreeSpaceLock Exclusive 8
XidGenLock Exclusive 15
CheckpointStartLock Exclusive 16

Lock Id Combined Time (ns)
XidGenLock 3825000
OidGenLock 5307100
WALInsertLock 6317800
FreeSpaceLock 7244100
CheckpointStartLock 22199200

bash-3.00# echo 600 users
600 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id Mode Count
OidGenLock Exclusive 1
WALInsertLock Exclusive 1
CheckpointStartLock Shared 4
CheckpointStartLock Exclusive 11
XidGenLock Exclusive 21

Lock Id Combined Time (ns)
OidGenLock 1728200
WALInsertLock 2040700
XidGenLock 19878500
CheckpointStartLock 24156500

bash-3.00# echo 700 users
700 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id Mode Count
OidGenLock Shared 1
XidGenLock Shared 1
CheckpointStartLock Shared 2
WALInsertLock Exclusive 4
CheckpointStartLock Exclusive 6
FreeSpaceLock Exclusive 6
XidGenLock Exclusive 13

Lock Id Combined Time (ns)
OidGenLock 1730000
WALInsertLock 7253400
FreeSpaceLock 10977700
CheckpointStartLock 13356800
XidGenLock 38220500

bash-3.00# echo 800 users
800 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id Mode Count
CheckpointStartLock Shared 1
WALInsertLock Exclusive 2
XidGenLock Shared 2
CheckpointStartLock Exclusive 5
FreeSpaceLock Exclusive 8
XidGenLock Exclusive 12

Lock Id Combined Time (ns)
WALInsertLock 3746800
FreeSpaceLock 7628900
CheckpointStartLock 11297500
XidGenLock 16649000

bash-3.00# echo 900 users - BIG DROP IN THROUGHPUT OCCURS...
900 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id Mode Count
OidGenLock Exclusive 1
OidGenLock Shared 1
XidGenLock Shared 1
FreeSpaceLock Exclusive 2
WALInsertLock Exclusive 2
CheckpointStartLock Shared 6
XidGenLock Exclusive 12
CheckpointStartLock Exclusive 121

Lock Id Combined Time (ns)
OidGenLock 1968100
FreeSpaceLock 2076300
WALInsertLock 2190400
XidGenLock 20259400
CheckpointStartLock 1407294300

bash-3.00# echo 950 users
950 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id Mode Count
OidGenLock Exclusive 1
OidGenLock Shared 2
CheckpointStartLock Shared 3
WALInsertLock Exclusive 4
FreeSpaceLock Exclusive 5
XidGenLock Exclusive 11
CheckpointStartLock Exclusive 50

Lock Id Combined Time (ns)
WALInsertLock 5577100
FreeSpaceLock 9115900
XidGenLock 13765800
OidGenLock 50155500
CheckpointStartLock 759872200

bash-3.00# echo 1000 users
1000 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id Mode Count
OidGenLock Shared 1
WALInsertLock Exclusive 1
XidGenLock Exclusive 5
CheckpointStartLock Shared 6
CheckpointStartLock Exclusive 102

Lock Id Combined Time (ns)
OidGenLock 21900
WALInsertLock 82500
XidGenLock 3313400
CheckpointStartLock 1448289900

bash-3.00# echo 1050 users
1050 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id Mode Count
FreeSpaceLock Exclusive 1
CheckpointStartLock Shared 3
XidGenLock Exclusive 3
CheckpointStartLock Exclusive 146

Lock Id Combined Time (ns)
FreeSpaceLock 18400
XidGenLock 1900900
CheckpointStartLock 2392893700

bash-3.00#

-Jignesh

Tom Lane wrote:
> "Jignesh K. Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM> writes:
>
>> Here is how I got the numbers..
>> I had about 1600 users login into postgresql. Then started the run with
>> 500 users and using DTrace I started tracking Postgresql Locking "as
>> viewed from one user/connection". Echo statements indicate how many
>> users were active at that point and how was throughput performing. All
>> IO is done on /tmp which means on a RAM disk.
>>
>
>
>> bash-3.00# echo 500 users - baseline number
>> 500 users
>> bash-3.00# ./3_lwlock_acquires.d 19178
>>
> I don't think I believe these numbers. For one thing, CheckpointLock
> is simply not ever taken in shared mode. The ratios of counts for
> different locks seems pretty improbable too, eg there is no way on
> earth that the LockMgrLocks are taken more often shared than
> exclusive (I would expect no shared acquires at all in the sort of
> test you are running). Not to mention that the absolute number of
> counts seems way too low. So I think the counting tool is broken.
>
>
> Combined time of what exactly? It looks like this must be the total
> duration the lock is held, at least assuming that the time for
> CheckpointLock is correctly reported. It'd be much more useful to see
> the total time spent waiting to acquire the lock.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mark Lewis 2007-07-26 14:37:23 Re: disk filling up
Previous Message Bill Moran 2007-07-26 13:34:32 Re: disk filling up