From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, David Steele <david(at)pgmasters(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Speed up Clog Access by increasing CLOG buffers |
Date: | 2016-03-24 02:38:02 |
Message-ID: | CAA4eK1JZ-V0R5PiY4gU17kq1fPcpDDeGcaUW+-OBDkUT6q5nwA@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Mar 24, 2016 at 5:40 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> On 2016-03-23 21:43:41 +0100, Andres Freund wrote:
> > I'm playing around with SELECT txid_current(); right now - that should
> > be about the most specific load for setting clog bits.
>
> Or so I thought.
>
> In my testing that showed just about zero performance difference between
> the patch and master. And more surprisingly, profiling showed very
> little contention on the control lock. Hacking
> TransactionIdSetPageStatus() to return without doing anything, actually
> only showed minor performance benefits.
>
> [there's also the fact that txid_current() indirectly acquires two
> lwlocks twice, which showed up more prominently than the control lock, but
> that I could easily hack around by adding a xid_current().]
>
> Similar with an INSERT only workload. And a small scale pgbench.
>
>
> Looking through the thread showed that the positive results you'd posted
> all were with relatively big scale factors.
>
I have seen smaller benefits at a 300 scale factor and somewhat larger
benefits at a 1000 scale factor. Also, Mithun has done similar testing
with unlogged tables, and those results [1] also look good.
>
> Which made me think. Running
> a bigger pgbench showed that most of the interesting (i.e. long) lock waits
> were both via TransactionIdSetPageStatus *and* TransactionIdGetStatus().
>
Yes, this matches what I have observed as well.
>
> So I think what happens is that once you have a big enough table, the
> UPDATEs standard pgbench does start to often hit *old* xids (in unhinted
> rows). Thus old pages have to be read in, potentially displacing slru
> content needed very shortly after.
>
>
> Have you, in your evaluation of the performance of this patch, done
> profiles over time? I.e. whether the performance benefits are the
> immediately, or only after a significant amount of test time? Comparing
> TPS over time, for both patched/unpatched looks relevant.
>
I have mainly run half-hour read-write tests. What do you want to
observe via shorter tests? They sometimes give inconsistent data for
read-write workloads.
>
> Even after changing to scale 500, the performance benefits on this,
> older 2 socket, machine were minor; even though contention on the
> ClogControlLock was the second most severe (after ProcArrayLock).
>
I have tried this patch mainly on an 8-socket machine with 300 and 1000
scale factors. Have you tried this test on unlogged tables, and by the
way, at what client count did you see these results?
> Afaics that squares with Jesper's result, which basically also didn't
> show a difference either way?
>
One difference is that I think Jesper did his testing with
synchronous_commit off, whereas my tests were run with
synchronous_commit on.
>
> I'm afraid that this patch might be putting bandaid on some of the
> absolutely worst cases, without actually addressing the core
> problem. Simon's patch in [1] seems to come closer to addressing that
> (which I don't believe is safe without doing every status
> manipulation atomically, as individual status bits are smaller than 4
> bytes). Now it's possible to argue that the bandaid might slow the
> bleeding to a survivable level, but I have to admit I'm doubtful.
>
> Here's the stats for a -s 500 run btw:
> Performance counter stats for 'system wide':
> 18,747 probe_postgres:TransactionIdSetTreeStatus
> 68,884 probe_postgres:TransactionIdGetStatus
> 9,718 probe_postgres:PGSemaphoreLock
> (the PGSemaphoreLock is over 50% ProcArrayLock, followed by ~15%
> SimpleLruReadPage_ReadOnly)
>
>
> My suspicion is that a better approach for now would be to take Simon's
> patch, but add a (per-page?) 'ClogModificationLock'; to avoid the need
> of doing something fancier in TransactionIdSetStatusBit().
>
I think we can try that as well, and if that approach shows better
results, then we can use it instead of the current patch.
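For the archives, the shape of the per-page lock idea can be sketched as
follows. This is not PostgreSQL code; it is a minimal Python model of
partitioning status updates by clog page so that writers to different pages
do not contend on one global control lock. The names (set_xact_status,
NUM_PAGE_LOCKS) are hypothetical; CLOG_XACTS_PER_PAGE reflects the
2-bits-per-transaction layout of an 8 kB clog page:

```python
import threading

NUM_PAGE_LOCKS = 128          # illustrative: one lock per partition of pages
CLOG_XACTS_PER_PAGE = 32768   # 8192-byte page, 2 status bits per transaction

page_locks = [threading.Lock() for _ in range(NUM_PAGE_LOCKS)]
status = {}  # xid -> status string, standing in for the clog bits on disk

def set_xact_status(xid, value):
    """Set one transaction's status under a per-page lock instead of a
    single global ClogControlLock, so updates that land on different
    clog pages proceed concurrently."""
    page_no = xid // CLOG_XACTS_PER_PAGE
    lock = page_locks[page_no % NUM_PAGE_LOCKS]
    with lock:
        status[xid] = value

# Two xids on different pages map to different locks and never contend;
# two xids on the same page serialize on the same per-page lock.
set_xact_status(7, "committed")
set_xact_status(7 + CLOG_XACTS_PER_PAGE, "aborted")
```

The trade-off being discussed is exactly this: a per-page (or per-partition)
lock keeps TransactionIdSetStatusBit() simple, at the cost of some extra lock
acquisitions, whereas fully lock-free status updates would require atomic
manipulation of sub-word status bits.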
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com