From: | Amit Kapila <amit(dot)kapila(at)huawei(dot)com> |
---|---|
To: | "'Heikki Linnakangas'" <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
Cc: | <pgsql-hackers(at)postgresql(dot)org> |
Subject: | [WIP] Performance Improvement by reducing WAL for Update Operation |
Date: | 2012-08-04 08:01:29 |
Message-ID: | 002101cd7217$5b38dd40$11aa97c0$@kapila@huawei.com |
Lists: | pgsql-hackers |
One point I missed that also needs to be handled is pg_upgrade.
-----Original Message-----
From: pgsql-hackers-owner(at)postgresql(dot)org
[mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Amit Kapila
Sent: Saturday, August 04, 2012 12:12 PM
To: 'Heikki Linnakangas'
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] [WIP] Performance Improvement by reducing WAL for
Update Operation
From: Heikki Linnakangas [mailto:heikki(dot)linnakangas(at)enterprisedb(dot)com]
Sent: Saturday, August 04, 2012 1:33 AM
On 03.08.2012 14:46, Amit kapila wrote:
>> Currently the change is done only for fixed length columns for simple
>> tables and the tuple should not contain NULLS.
>>
>> This is a Proof of concept, the design and implementation needs to be
>> changed based on final design required for handling other scenario's
>>
>> Update operation:
>> -----------------------------
>> 1. Check for the simple table or not.(No toast, No before update
>>    triggers)
>> 2. Works only for not null tuples.
>> 3. Identify the modified columns from the target entry.
>> 4. Based on the modified column list, check for any variable length
>>    columns are modified, if so this optimization is not applied.
>> 5. Identify the offset and length for the modified columns and store it
>>    as an optimized WAL tuple in the following format.
>> Note: Wal update header is modified to denote whether wal update
>> optimization is done or not.
>> WAL update header + Tuple header(no change from previous format) +
>> [offset(2bytes)] [length(2 bytes)] [changed data value]
>> [offset(2bytes)] [length(2 bytes)] [changed data value]
>> ....
>> ....
> The performance will need to be re-verified after you fix these
> limitations. Those limitations need to be fixed before this can be
> applied.
Yes, I agree that the solution should fix these limitations and that the
performance numbers need to be re-verified.
Currently in my mind the work to be done is as follows:
1. Solution which can handle Variable length columns and NULLs
2. Handling of Before Triggers
3. Decide whether the solution for fixed-length columns can be the same as
the one for variable-length columns and NULLs.
4. Make the final code patch which addresses all of the above.
Please suggest if there are more things that need to be handled.
For the 3rd point, the current solution for fixed-length columns cannot
handle variable-length columns and NULLs. The reason is that fixed-length
columns need no diff between the old and new tuple, whereas the other cases
do, since a change in a column's length or NULL-ness shifts the data that
follows it.
For fixed-length columns, it is sufficient to note the OFFSET, LENGTH and
VALUE of the changed columns of the new tuple in WAL; replay can apply them
on top of the old tuple. To handle the other cases we need a diff mechanism.
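To make the fixed-length case concrete, here is a minimal sketch in Python (not PostgreSQL code) of writing and replaying [offset(2 bytes)] [length(2 bytes)] [changed data value] entries. The function names, the little-endian byte order and the flat byte-string view of a tuple are assumptions made only for illustration; the WAL update header and tuple header are left out.

```python
import struct

# Hypothetical entry layout, following the format quoted above:
# [offset (2 bytes)] [length (2 bytes)] [changed data value]
# Little-endian byte order is an assumption of this sketch.
ENTRY_HEADER = struct.Struct("<HH")


def encode_changes(changes):
    """Pack (offset, new_bytes) pairs into one record body."""
    out = bytearray()
    for offset, value in changes:
        out += ENTRY_HEADER.pack(offset, len(value))
        out += value
    return bytes(out)


def replay(old_tuple, record):
    """Rebuild the new tuple data by patching the old tuple data."""
    new_tuple = bytearray(old_tuple)
    pos = 0
    while pos < len(record):
        offset, length = ENTRY_HEADER.unpack_from(record, pos)
        pos += ENTRY_HEADER.size
        if offset + length > len(new_tuple):
            raise ValueError("entry points past the end of the tuple")
        # Fixed-length columns: same offset and same size in old and new.
        new_tuple[offset:offset + length] = record[pos:pos + length]
        pos += length
    return bytes(new_tuple)


# Example: a tuple of (int4, int4, int8) where only the second column changes.
old = struct.pack("<iiq", 1, 100, 5000)
new = struct.pack("<iiq", 1, 250, 5000)
record = encode_changes([(4, struct.pack("<i", 250))])
assert replay(old, record) == new
```

In this example the record body is 8 bytes (4 bytes of offset/length plus the 4-byte value) against 16 bytes of tuple data, which is where the WAL reduction would come from.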
Can we do something like this: if the changed columns are fixed length and
don't contain NULLs, store the [OFFSET, LENGTH, VALUE] format in WAL, and for
the other cases store the diff format?
This has the advantage that updates touching only fixed-length columns
don't have to pay the penalty of doing a diff between the new and old tuple.
Also, we can do the whole work in 2 parts: one for fixed-length columns and a
second to handle the other cases.
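The choice between the two formats could be sketched as follows. This is only an illustration in Python: `Column`, `choose_wal_format` and the two format names are hypothetical, and applying the NULL check to the whole old and new tuple (as in the proof of concept) rather than only to the changed columns is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Column:
    name: str
    fixed_length: bool


def choose_wal_format(columns, modified, old_values, new_values):
    """Return which (hypothetical) WAL update format would apply.

    columns:    dict of column name -> Column
    modified:   names of the columns changed by the UPDATE
    old_values: dict of column name -> value (None means NULL)
    new_values: dict of column name -> value (None means NULL)
    """
    # Assumption: any NULL in either tuple version forces the diff format.
    if any(v is None for v in old_values.values()):
        return "diff"
    if any(v is None for v in new_values.values()):
        return "diff"
    # Any modified variable-length column forces the diff format.
    if any(not columns[name].fixed_length for name in modified):
        return "diff"
    return "offset_length_value"


cols = {
    "id": Column("id", True),
    "balance": Column("balance", True),
    "note": Column("note", False),
}
old_row = {"id": 1, "balance": 10, "note": "a"}

# Only a fixed-length column changes: no diff is needed.
assert choose_wal_format(
    cols, ["balance"], old_row, {"id": 1, "balance": 20, "note": "a"}
) == "offset_length_value"

# A variable-length column changes: fall back to the diff format.
assert choose_wal_format(
    cols, ["note"], old_row, {"id": 1, "balance": 10, "note": "abc"}
) == "diff"
```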
> It would be nice to use some well-known binary delta algorithm for this,
> rather than invent our own. OTOH, we have more knowledge of the
> attribute boundaries, so a custom algorithm might work better.
I shall work on this and post after initial work.
> In any case, I'd like to see the code to do the delta encoding/decoding to
> be put into separate functions, outside of heapam.c. It would be good for
> readability, and we might want to reuse this in other places too.
Agreed. I shall take care of doing it in the suggested way.
With Regards,
Amit Kapila.