Re: SSI slows down over time

From: Ryan Johnson <scovich(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: SSI slows down over time
Date: 2014-04-07 14:11:37
Message-ID: 5342B219.9040001@gmail.com

On 06/04/2014 10:55 AM, Tom Lane wrote:
> Ryan Johnson <ryan(dot)johnson(at)cs(dot)utoronto(dot)ca> writes:
>> I get a strange behavior across repeated runs: each 100-second run is a
>> bit slower than the one preceding it, when run with SSI (SERIALIZABLE).
>> ... So the question: what should I look for to diagnose/triage this problem?
> In the past I've seen behaviors like this that traced to the range of
> "interesting" transaction IDs getting wider as time went on, so that
> more pages of pg_clog were hot, leading to more I/O traffic in the
> clog SLRU buffers. Possibly this is some effect like that.
The effect disappears completely if I run under SI instead of SSI,
though. That makes me strongly suspect that the problem lurks in
SSI-specific infrastructure.

However, I did notice that the SLRU buffer that holds "old" SSI
transactions sometimes spikes from hundreds of entries to millions
(found by annotating the source to ereport a warning whenever the
distance between the buffer head and tail is at least 10% higher than
the previous record). Not sure if that's related, though: I'm pretty
sure SSI never scans that SLRU; it's only used for random lookups.
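
For concreteness, the annotation is along these lines (a from-memory
sketch rather than the exact patch; it assumes the check is called from
the spot in predicate.c that advances the SLRU head, and the argument
names are placeholders):

/*
 * Illustrative sketch only: warn whenever the distance between the SSI
 * SLRU head and tail beats this backend's previous record by >= 10%.
 * Relies on postgres.h / TransactionId from the surrounding file; where
 * exactly it gets called from (and with which values) is an assumption.
 */
static void
warn_on_serializable_slru_spike(TransactionId headXid, TransactionId tailXid)
{
    static uint32 record_span = 0;    /* largest span seen so far */
    uint32      span = (uint32) (headXid - tailXid);

    if (span > record_span + record_span / 10)
    {
        record_span = span;
        ereport(WARNING,
                (errmsg("SSI SLRU head/tail span is now %u entries", span)));
    }
}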

>> I've tried linux perf, but all it says is that lots of time is going to
>> LWLock (but callgraph tracing doesn't work in my not-bleeding-edge
>> kernel).
> You could recompile with -DLWLOCK_STATS to enable gathering stats on
> which LWLocks are heavily contended. That would be a starting point
> for localizing the cause.

Here are the offenders (100-second run, 24 clients, ~2.2ktps):
> lwlock 7 shacq 0 exacq 7002810 blk 896235 spindelay 213
> lwlock 28 shacq 94984166 exacq 3938085 blk 572762 spindelay 163
> lwlock 65 shacq 3347000 exacq 2933440 blk 255927 spindelay 90
> lwlock 79 shacq 1742574 exacq 3009663 blk 216178 spindelay 41
> lwlock 76 shacq 2293386 exacq 2892242 blk 205199 spindelay 70
> lwlock 66 shacq 498909 exacq 2987485 blk 171508 spindelay 48
> lwlock 80 shacq 512107 exacq 3181753 blk 165772 spindelay 43
> lwlock 71 shacq 815733 exacq 3088157 blk 165579 spindelay 48
> lwlock 74 shacq 603321 exacq 3065391 blk 159953 spindelay 56
> lwlock 67 shacq 695465 exacq 2918970 blk 149339 spindelay 28
> lwlock 69 shacq 411203 exacq 3044007 blk 148655 spindelay 34
> lwlock 72 shacq 515260 exacq 2973321 blk 147533 spindelay 43
> lwlock 30 shacq 41628636 exacq 8799 blk 143889 spindelay 186
> lwlock 75 shacq 409472 exacq 2987227 blk 143196 spindelay 38
> lwlock 77 shacq 409401 exacq 2946972 blk 139507 spindelay 34
> lwlock 73 shacq 402544 exacq 2943467 blk 139380 spindelay 43
> lwlock 78 shacq 404220 exacq 2912665 blk 137625 spindelay 21
> lwlock 70 shacq 603643 exacq 2816730 blk 135851 spindelay 37
> lwlock 68 shacq 403533 exacq 2862017 blk 131946 spindelay 30
> lwlock 29 shacq 0 exacq 255302 blk 75838 spindelay 1
> lwlock 0 shacq 0 exacq 561508 blk 12445 spindelay 3
> lwlock 11 shacq 1245499 exacq 219717 blk 5501 spindelay 10
> lwlock 4 shacq 381211 exacq 209146 blk 1273 spindelay 4
> lwlock 3 shacq 16 exacq 209081 blk 522 spindelay 0
> lwlock 8 shacq 0 exacq 137961 blk 50 spindelay 0
> lwlock 2097366 shacq 0 exacq 384586 blk 1 spindelay 0
> lwlock 2097365 shacq 0 exacq 370176 blk 1 spindelay 0
> lwlock 2097359 shacq 0 exacq 363845 blk 1 spindelay 0

The above aggregates the per-lock stats from all processes, filters out
locks with fewer than 10000 accesses (shared+exclusive) or with zero
blk, then sorts by highest blk first.
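
For reference, the aggregation step was a throwaway filter along these
lines (sketched here in C for concreteness, not the exact script I ran;
it assumes the raw per-backend -DLWLOCK_STATS lines, captured from
stderr into one file and fed on stdin, each contain "lwlock <id>"
followed by the shacq/exacq/blk/spindelay counters):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    long        id;
    unsigned long shacq, exacq, blk, spin;
} Stat;

static Stat *stats;
static size_t nstats, cap;

/* linear search is fine for the few thousand distinct lock ids seen here */
static Stat *
find(long id)
{
    for (size_t i = 0; i < nstats; i++)
        if (stats[i].id == id)
            return &stats[i];
    if (nstats == cap)
        stats = realloc(stats, (cap = cap ? cap * 2 : 256) * sizeof(Stat));
    memset(&stats[nstats], 0, sizeof(Stat));
    stats[nstats].id = id;
    return &stats[nstats++];
}

/* sort by blk, highest first */
static int
by_blk_desc(const void *a, const void *b)
{
    unsigned long x = ((const Stat *) a)->blk;
    unsigned long y = ((const Stat *) b)->blk;

    return (x < y) - (x > y);
}

int
main(void)
{
    char        line[1024];

    while (fgets(line, sizeof(line), stdin))
    {
        char       *p = strstr(line, "lwlock");
        long        id;
        unsigned long sh, ex, blk, spin;

        /* tolerate any prefix (e.g. "PID 1234 ") before the counters */
        if (p && sscanf(p, "lwlock %ld%*[: ]shacq %lu exacq %lu blk %lu spindelay %lu",
                        &id, &sh, &ex, &blk, &spin) == 5)
        {
            Stat       *s = find(id);

            s->shacq += sh;
            s->exacq += ex;
            s->blk += blk;
            s->spin += spin;
        }
    }

    /* drop locks with < 10000 accesses or zero blk; print highest blk first */
    qsort(stats, nstats, sizeof(Stat), by_blk_desc);
    for (size_t i = 0; i < nstats; i++)
        if (stats[i].shacq + stats[i].exacq >= 10000 && stats[i].blk > 0)
            printf("lwlock %ld shacq %lu exacq %lu blk %lu spindelay %lu\n",
                   stats[i].id, stats[i].shacq, stats[i].exacq,
                   stats[i].blk, stats[i].spin);
    return 0;
}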

According to [1], locks {28, 29, 30} are {SerializableXactHashLock,
SerializableFinishedListLock, SerializablePredicateLockListLock}, all
SSI-related; locks 65-80 are the sixteen PredicateLockMgrLocks that the
post mentions. Looking in lwlock.h, lock 7 (which tops the list) is the
WALInsertLock. That lock was *not* mentioned in the pgsql-hackers post.

Re-running the same analysis for SI instead of SSI gives 4.6ktps and a
much shorter list:
> lwlock 7 shacq 0 exacq 14050121 blk 3429384 spindelay 347
> lwlock 11 shacq 3133994 exacq 450325 blk 23456 spindelay 29
> lwlock 0 shacq 0 exacq 684775 blk 19158 spindelay 3
> lwlock 4 shacq 780846 exacq 428771 blk 4539 spindelay 6
> lwlock 3 shacq 19 exacq 428705 blk 1147 spindelay 0
> lwlock 59 shacq 0 exacq 125943 blk 203 spindelay 0
> lwlock 8 shacq 0 exacq 287629 blk 116 spindelay 0
> lwlock 2097358 shacq 0 exacq 752361 blk 1 spindelay 0
> lwlock 2097355 shacq 0 exacq 755086 blk 1 spindelay 0
> lwlock 2097352 shacq 0 exacq 760232 blk 1 spindelay 0

However, all of this only confirms that SSI has a lock bottleneck; it
doesn't say why the bottleneck gets worse over time.

[1]
http://www.postgresql.org/message-id/CA+TgmoYAiSM2jWEndReY5PL0sKbhgg7dbDH6r=oXKYzi9B7KJA@mail.gmail.com

Thoughts?
Ryan
