| From: | Amit Kapila <amit(dot)kapila(at)huawei(dot)com> |
|---|---|
| To: | "'Heikki Linnakangas'" <hlinnakangas(at)vmware(dot)com> |
| Cc: | <pgsql-hackers(at)postgresql(dot)org>, <noah(at)leadboat(dot)com> |
| Subject: | Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation |
| Date: | 2012-09-27 13:08:39 |
| Message-ID: | 00aa01cd9cb1$36c29a40$a447cec0$@kapila@huawei.com |
| Lists: | pgsql-hackers |
> On Thursday, September 27, 2012 4:12 PM Heikki Linnakangas wrote:
> On 25.09.2012 18:27, Amit Kapila wrote:
> > If you feel it is must to do the comparison, we can do it in same way
> > as we identify for HOT?
>
> Yeah. (But as discussed, I think it would be even better to just treat
> the old and new tuple as an opaque chunk of bytes, and run them through
> a generic delta algorithm).
>
Thank you for the modified patch.
> The conclusion is that there isn't very much difference among the
> patches. They all squeeze the WAL to about the same size, and the
> increase in TPS is roughly the same.
>
> I think more performance testing is required. The modified pgbench test
> isn't necessarily very representative of a real-life application. The
> gain (or loss) of this patch is going to depend a lot on how many
> columns are updated, and in what ways. Need to test more scenarios,
> with many different database schemas.
>
> The LZ approach has the advantage that it can take advantage of all
> kinds of similarities between old and new tuple. For example, if you
> swap the values of two columns, LZ will encode that efficiently. Or if
> you insert a character in the middle of a long string. On the flipside,
> it's probably more expensive. Then again, you have to do a memcmp() to
> detect which columns have changed with your approach, and that's not
> free either. That was not yet included in the patch version I tested.
> Another consideration is that when you compress the record more, you
> have less data to calculate CRC for. CRC calculation tends to be quite
> expensive, so even quite aggressive compression might be a win. Yet
> another consideration is that the compression/encoding is done while
> holding a lock on the buffer. For the sake of concurrency, you want to
> keep the duration the lock is held as short as possible.
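To make the "opaque chunk of bytes" idea concrete, here is a minimal sketch in C. It is not LZ and not the code from any of the patches under discussion: it is a much simpler prefix/suffix delta, used only to illustrate how a byte-level encoding can handle a case such as a character inserted in the middle of a long string without knowing anything about columns. The names `TupleDelta`, `compute_delta` and `apply_delta` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical delta descriptor; not a PostgreSQL structure. */
typedef struct TupleDelta
{
    size_t               prefix_len;  /* leading bytes shared by old and new */
    size_t               suffix_len;  /* trailing bytes shared by old and new */
    const unsigned char *middle;      /* changed bytes, pointing into the new tuple */
    size_t               middle_len;  /* number of changed bytes */
} TupleDelta;

/* Treat both tuples as opaque byte strings and find the shared prefix/suffix. */
TupleDelta
compute_delta(const unsigned char *oldp, size_t oldlen,
              const unsigned char *newp, size_t newlen)
{
    size_t      maxshare = (oldlen < newlen) ? oldlen : newlen;
    size_t      prefix = 0;
    size_t      suffix = 0;
    TupleDelta  d;

    while (prefix < maxshare && oldp[prefix] == newp[prefix])
        prefix++;

    /* The suffix must not overlap the prefix in either tuple. */
    while (suffix < maxshare - prefix &&
           oldp[oldlen - 1 - suffix] == newp[newlen - 1 - suffix])
        suffix++;

    d.prefix_len = prefix;
    d.suffix_len = suffix;
    d.middle = newp + prefix;
    d.middle_len = newlen - prefix - suffix;
    return d;
}

/* Rebuild the new tuple from the old tuple plus the delta; returns 0 on error. */
size_t
apply_delta(const unsigned char *oldp, size_t oldlen,
            const TupleDelta *d, unsigned char *out, size_t outcap)
{
    size_t      total = d->prefix_len + d->middle_len + d->suffix_len;

    if (total > outcap || d->prefix_len + d->suffix_len > oldlen)
        return 0;

    memcpy(out, oldp, d->prefix_len);
    memcpy(out + d->prefix_len, d->middle, d->middle_len);
    memcpy(out + d->prefix_len + d->middle_len,
           oldp + oldlen - d->suffix_len, d->suffix_len);
    return total;
}
```

With an old value of "hello world" and a new value of "hello, world", only the single inserted byte ends up in the delta. A swap of two column values would defeat this simple scheme, which is where a real LZ-style encoder that can reference arbitrary ranges of the old tuple would do better.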
I will now run tests for the following cases and post the results here:
a. The attached patch, in the mode where it takes advantage of the history tuple.
b. The patch with the modified-column detection changed to use memcmp().
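As a rough illustration of case (b), the sketch below marks a column as modified when its length or its bytes differ between the old and new tuple. It assumes the column values have already been located in both tuples; the `ColumnRef` type and `find_modified_columns` function are hypothetical and do not correspond to anything in the patch or in the PostgreSQL source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference to one column value inside a tuple. */
typedef struct ColumnRef
{
    const void *data;   /* start of the column value */
    size_t      len;    /* length of the value in bytes */
} ColumnRef;

/*
 * Compare each column of the old and new tuple with memcmp() and record
 * which ones changed. Returns the number of modified columns.
 */
size_t
find_modified_columns(const ColumnRef *oldcols, const ColumnRef *newcols,
                      size_t ncols, bool *modified)
{
    size_t      nmodified = 0;

    for (size_t i = 0; i < ncols; i++)
    {
        /* Different lengths mean the column changed; skip memcmp() on empty values. */
        bool        same = oldcols[i].len == newcols[i].len &&
            (oldcols[i].len == 0 ||
             memcmp(oldcols[i].data, newcols[i].data, oldcols[i].len) == 0);

        modified[i] = !same;
        if (!same)
            nmodified++;
    }
    return nmodified;
}
```

The cost of this loop is the memcmp() overhead mentioned in the quoted text: every column is read once from both tuples, even when only one column was updated.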
With Regards,
Amit Kapila.