From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2016-10-21 11:29:15
Message-ID: CAA4eK1L3iq8CQztz9SfG-5iJo2PLxHOV0jnWCspA7cFvoqJ6gQ@mail.gmail.com
Lists: pgsql-hackers
On Fri, Oct 21, 2016 at 1:07 PM, Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> On 10/21/2016 08:13 AM, Amit Kapila wrote:
>>
>> On Fri, Oct 21, 2016 at 6:31 AM, Robert Haas <robertmhaas(at)gmail(dot)com>
>> wrote:
>>>
>>> On Thu, Oct 20, 2016 at 4:04 PM, Tomas Vondra
>>> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>>>>
>>>>> I then started a run at 96 clients which I accidentally killed shortly
>>>>> before it was scheduled to finish, but the results are not much
>>>>> different; there is no hint of the runaway CLogControlLock contention
>>>>> that Dilip sees on power2.
>>>>>
>>>> What shared_buffer size were you using? I assume the data set fit into
>>>> shared buffers, right?
>>>
>>>
>>> 8GB.
>>>
>>>> FWIW as I explained in the lengthy post earlier today, I can actually
>>>> reproduce the significant CLogControlLock contention (and the patches do
>>>> reduce it), even on x86_64.
>>>
>>>
>>> /me goes back, rereads post. Sorry, I didn't look at this carefully
>>> the first time.
>>>
>>>> For example consider these two tests:
>>>>
>>>> * http://tvondra.bitbucket.org/#dilip-300-unlogged-sync
>>>> * http://tvondra.bitbucket.org/#pgbench-300-unlogged-sync-skip
>>>>
>>>> However, it seems I can also reproduce fairly bad regressions, like for
>>>> example this case with data set exceeding shared_buffers:
>>>>
>>>> * http://tvondra.bitbucket.org/#pgbench-3000-unlogged-sync-skip
>>>
>>>
>>> I'm not sure how seriously we should take the regressions. I mean,
>>> what I see there is that CLogControlLock contention goes down by about
>>> 50% -- which is the point of the patch -- and WALWriteLock contention
>>> goes up dramatically -- which sucks, but can't really be blamed on the
>>> patch except in the indirect sense that a backend can't spend much
>>> time waiting for A if it's already spending all of its time waiting
>>> for B.
>>>
>>
>> Right, I think not only WALWriteLock, but contention on other locks
>> also goes up, as you can see in the table below. I think there is
>> nothing much we can do about that with this patch. One thing which
>> is unclear is why the unlogged tests show WALWriteLock contention
>> at all.
>>
>
> Well, although we don't write the table data to the WAL, we still need to
> write commits and other stuff, right?
>
We do need to write the commit record, but do we need to flush it to
WAL immediately for unlogged tables? It seems we allow the WAL writer
to do that; see the logic in RecordTransactionCommit, sketched below.
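(A simplified sketch of that branch, paraphrased from memory from
src/backend/access/transam/xact.c; details and error handling are
elided, so treat it as illustrative rather than exact:)

    /* wrote_xlog is computed before the commit record is inserted */
    wrote_xlog = (XactLastRecEnd != 0);
    ...
    if ((wrote_xlog && markXidCommitted &&
         synchronous_commit > SYNCHRONOUS_COMMIT_OFF) ||
        forceSyncCommit || nrels > 0)
    {
        /* Durable commit: flush WAL up to the commit record ourselves. */
        XLogFlush(XactLastRecEnd);
    }
    else
    {
        /*
         * Lazy path: just advertise the commit LSN and let the WAL
         * writer flush it.  A transaction touching only unlogged
         * tables writes no data WAL, so wrote_xlog is false and it
         * lands here instead of waiting on WALWriteLock at commit.
         */
        XLogSetAsyncXactLSN(XactLastRecEnd);
    }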
> And on scale 3000 (which exceeds the
> 16GB shared buffers in this case), there's a continuous stream of dirty
> pages (not to WAL, but evicted from shared buffers), so iostat looks like
> this:
>
>     time     tps  wr_sec/s  avgrq-sz   avgqu-sz    await  %util
> 08:48:21   81654   1367483     16.75  127264.60  1294.80  97.41
> 08:48:31   41514    697516     16.80  103271.11  3015.01  97.64
> 08:48:41   78892   1359779     17.24   97308.42   928.36  96.76
> 08:48:51   58735    978475     16.66   92303.00  1472.82  95.92
> 08:49:01   62441   1068605     17.11   78482.71  1615.56  95.57
> 08:49:11   55571    945365     17.01  113672.62  1923.37  98.07
> 08:49:21   69016   1161586     16.83   87055.66  1363.05  95.53
> 08:49:31   54552    913461     16.74   98695.87  1761.30  97.84
>
> That's ~500-600 MB/s of continuous writes. I'm sure the storage could handle
> more than this (will do some testing after the tests complete), but surely
> the WAL has to compete for bandwidth (it's on the same volume / devices).
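(Sanity check on that figure: iostat reports wr_sec/s in 512-byte
sectors, so for example 1161586 * 512 bytes ≈ 595 MB/s, and the
samples above span roughly 357-700 MB/s, consistent with the
~500-600 MB/s estimate.)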
> Another thing is that we only have 8 WAL insert locks, and maybe that leads
> to contention with such high client counts.
>
Yeah, quite possible, but I don't think increasing that would help in
general, because before writing WAL out we have to wait for the
in-progress insertions under all of the wal_insert locks to finish
(see the sketch below). In any case, I think that is a separate
problem to study.
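(Rough sketch of why more insert locks don't obviously help the flush
path, paraphrased from WaitXLogInsertionsToFinish() in
src/backend/access/transam/xlog.c -- not the exact source, but the
shape is right: every flush has to poll all NUM_XLOGINSERT_LOCKS
slots, so raising the lock count also raises the work done here:)

    XLogRecPtr finishedUpto = reservedUpto;

    for (int i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
    {
        XLogRecPtr insertingat = InvalidXLogRecPtr;

        /*
         * Wait until slot i either releases its lock (no insertion in
         * progress) or advertises an insert position past 'upto'.
         */
        do
        {
            if (LWLockWaitForVar(&WALInsertLocks[i].l.lock,
                                 &WALInsertLocks[i].l.insertingAt,
                                 insertingat, &insertingat))
            {
                /* lock was free, so no insertion in progress here */
                insertingat = InvalidXLogRecPtr;
                break;
            }
        } while (insertingat < upto);

        if (insertingat != InvalidXLogRecPtr && insertingat < finishedUpto)
            finishedUpto = insertingat;
    }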
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com