From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2014-02-05 11:43:28
Message-ID: 52F223E0.6030306@vmware.com
Lists: pgsql-hackers
On 02/05/2014 07:54 AM, Amit Kapila wrote:
> On Tue, Feb 4, 2014 at 11:58 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Tue, Feb 4, 2014 at 12:39 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> Now there is approximately 1.4~5% CPU gain for
>>> "hundred tiny fields, half nulled" case
>
>> Assuming that the logic isn't buggy, a point in need of further study,
>> I'm starting to feel like we want to have this. And I might even be
>> tempted to remove the table-level off switch.
>
> I have tried to stress on worst case more, as you are thinking to
> remove table-level switch and found that even if we increase the
> data by approx. 8 times ("ten long fields, all changed", each field contains
> 80 byte data), the CPU overhead is still < 5% which clearly shows that
> the overhead doesn't increase much even if the length of unmatched data
> is increased by much larger factor.
> So the data for worst case adds more weight to your statement
> ("remove table-level switch"), however there is no harm in keeping
> table-level option with default as 'true' and if some users are really sure
> the updates in their system will have nothing in common, then they can
> make this new option as 'false'.
>
> Below is data for the new case " ten long fields, all changed" added
> in attached script file:
That's not the worst case, by far.
First, note that the skipping while scanning the new tuple is only
performed in the first loop. That means that as soon as you have a single
match, you fall back to hashing every byte. So for the worst case, put one
4-byte field as the first column, and don't update it.
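To make the effect concrete, here is a minimal standalone sketch of the
skip-until-first-match heuristic as described above. This is not the patch's
actual code; match_at() stands in for the real history hash lookup, and the
4-byte chunk size is an assumption for illustration. It just counts how many
positions get examined, showing how one early match defeats the skipping:

```c
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for the real history lookup: "match" if the 4 bytes of the
 * new tuple at this offset equal the old tuple's bytes at the same offset.
 * (Purely illustrative; the patch uses a hash table over the old tuple.)
 */
static int
match_at(const char *oldt, size_t oldlen, const char *newt, size_t pos)
{
	return pos + 4 <= oldlen && memcmp(oldt + pos, newt + pos, 4) == 0;
}

/*
 * Scan the new tuple. While nothing has matched yet, stride 4 bytes at a
 * time; after the first match, fall back to examining every byte, as the
 * skipping logic described above does.  Returns how many positions were
 * examined, i.e. how much hashing work was done.
 */
static size_t
count_positions_examined(const char *oldt, size_t oldlen,
						 const char *newt, size_t newlen)
{
	size_t		examined = 0;
	size_t		pos = 0;
	int			found_match = 0;

	while (pos + 4 <= newlen)
	{
		examined++;
		if (match_at(oldt, oldlen, newt, pos))
			found_match = 1;
		/* skip while nothing has matched; byte-at-a-time afterwards */
		pos += found_match ? 1 : 4;
	}
	return examined;
}
```

With two completely dissimilar tuples the scan only touches every fourth
position, but if the very first chunk matches (the unchanged leading 4-byte
column), nearly every byte of the new tuple gets hashed.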
Also, I suspect the runtimes in your test were dominated by I/O. When I
scale down the number of rows involved so that the whole test fits in
RAM, I get much bigger differences with and without the patch. You might
also want to turn off full_page_writes, to make the effect clear with
less data.
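For reference, settings along these lines make the WAL-volume difference
visible with a small data set (the exact values here are illustrative, not
taken from the attached script):

```
# postgresql.conf fragment for the benchmark (illustrative values)
full_page_writes = off        # keep full-page images from dominating WAL volume
checkpoint_timeout = 30min    # fewer checkpoints -> fewer full-page writes anyway
shared_buffers = 1GB          # large enough that the scaled-down test fits in RAM
```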
So, I came up with the attached worst case test, modified from your
latest test suite.
unpatched:
testname | wal_generated | duration
--------------------------------------+---------------+------------------
ten long fields, all but one changed | 343385312 | 2.20806908607483
ten long fields, all but one changed | 336263592 | 2.18997097015381
ten long fields, all but one changed | 336264504 | 2.17843413352966
(3 rows)
pgrb_delta_encoding_v8.patch:
testname | wal_generated | duration
--------------------------------------+---------------+------------------
ten long fields, all but one changed | 338356944 | 3.33501315116882
ten long fields, all but one changed | 344059272 | 3.37364101409912
ten long fields, all but one changed | 336257840 | 3.36244201660156
(3 rows)
So with this test, the overhead is very significant.
With the skipping logic, another kind of worst case is that you have a lot
of similarity between the old and new tuple, but you miss it because you
skip. For example, if you change the first few columns, but leave a large
text column at the end of the tuple unchanged.
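A hypothetical schema that would exercise that case (table and column names
invented here, not part of the attached test suite) might look like:

```sql
-- The leading columns change on every update, so the byte-skipping scan
-- may stride right past the large unchanged text column and never notice
-- the similarity that delta encoding could have exploited.
CREATE TABLE skip_miss (id int, counter int, big_text text);
INSERT INTO skip_miss
	SELECT g, 0, repeat('x', 2000) FROM generate_series(1, 10000) g;
UPDATE skip_miss SET counter = counter + 1;  -- big_text left unchanged
```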
- Heikki
Attachment: wal-update-testsuite.sh (application/x-sh, 14.9 KB)