From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie> |
Subject: | Re: Combine Prune and Freeze records emitted by vacuum |
Date: | 2024-04-01 14:17:51 |
Message-ID: | e7b3116c-79b0-4864-a8d2-5b59b498f724@iki.fi |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 30/03/2024 07:57, Melanie Plageman wrote:
> On Fri, Mar 29, 2024 at 12:32:21PM -0400, Melanie Plageman wrote:
>> On Fri, Mar 29, 2024 at 12:00 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>>> Here's another idea: In the first loop through the offsets, where we
>>> gather the HTSV status of each item, also collect the offsets of all HOT
>>> and non-HOT items to two separate arrays. Call heap_prune_chain() for
>>> all the non-HOT items first, and then process any remaining HOT tuples
>>> that haven't been marked yet.
>>
>> That's an interesting idea. I'll try it out and see how it works.
>
> Attached v10 implements this method of dividing tuples into HOT and
> non-HOT and processing the potential HOT chains first then processing
> tuples not marked by calling heap_prune_chain().
>
> I have applied the refactoring of heap_prune_chain() to master and then
> built the other patches on top of that.
Committed some of the changes. Continuing to reviewing the rest.
> I discovered while writing this that LP_DEAD item offsets must be in
> order in the deadoffsets array (the one that is used to populate
> LVRelState->dead_items).
>
> When I changed heap_page_prune_and_freeze() to partition the offsets
> into HOT and non-HOT during the first loop through the item pointers
> array (where we get tuple visibility information), we add dead item
> offsets as they are encountered. So, they are no longer in order. I've
> added a quicksort of the deadoffsets array to satisfy vacuum.
Good catch.
> I think that we are actually successfully removing more RECENTLY_DEAD
> HOT tuples than in master with heap_page_prune()'s new approach, and I
> think it is correct; but let me know if I am missing something.
Yep. In the case of a two-item chain, RECENTLY_DEAD -> DEAD, the new
code can always remove both items. On 'master', it depends on which item
it happens to process first. If it processes the RECENTLY_DEAD item
first, then it follows the chain and removes both. But if it processes
the DEAD item first, the RECENTLY_DEAD item is left behind. It will be
removed by the next VACUUM, so it's not a correctness issue, and
probably doesn't make any practical performance difference either as
it's a rare corner case, but I feel better that it's more deterministic now.
> The early patches in the set include some additional comment cleanup as
> well. 0001 is fairly polished. 0004 could use some variable renaming
> (this patch partitions the tuples into HOT and not HOT and then
> processes them). I was struggling with some of the names here
> (chainmembers and chaincandidates is confusing).
I didn't understand why you wanted to juggle both partitions in the same
array. So I separated them into two arrays, and called them 'root_items'
and 'heaponly_items'.
In some micro-benchmarks, the order that the items were processed made a
measurable difference. So I'm processing the items in the reverse order.
That roughly matches the order the items are processed in master, as it
iterates the offsets from high-to-low in the first loop, and low-to-high
in the second loop.
> The bulk of the combining of pruning and freezing is lumped into 0010.
>
> I had planned to separate 0010 into 4 separate patches: 1 to execute
> freezing in heap_prune_chain(), 1 for the freeze heuristic approximating
> what is on master, and 1 for emitting a single record containing both
> the pruning and freezing page modifications.
>
> I ended up not doing this because I felt like the grouping of changes in
> 0007-0009 is off. As long as I still execute freezing in
> lazy_scan_prune(), I have to share lots of state between
> lazy_scan_prune() and heap_page_prune(). This meant I added a lot of
> parameters to heap_page_prune() that later commits removed -- making the
> later patches noisy and not so easy to understand.
>
> I'm actually not sure what should go in what commit (either for review
> clarity or for the actual final version).
>
> But, I think we should probably focus on review of the code and not as
> much how it is split up yet.
Yeah, that's fine, 0010 is manageable-sized now.
> The final state of the code could definitely use more cleanup. I've been
> staring at it for awhile, so I could use some thoughts/ideas about what
> part to focus on improving.
Committed some of the changes. I plan to commit at least the first of
these remaining patches later today. I'm happy with it now, but I'll
give it a final glance over after dinner.
I'll continue to review the rest after that, but attached is what I have
now.
--
Heikki Linnakangas
Neon (https://neon.tech)
Attachment | Content-Type | Size |
---|---|---|
v11-0001-Handle-non-chain-tuples-outside-of-heap_prune_ch.patch | text/x-patch | 15.5 KB |
v11-0002-Invoke-heap_prune_record_prunable-during-record-.patch | text/x-patch | 8.2 KB |
v11-0003-Introduce-PRUNE_DO_-actions.patch | text/x-patch | 8.0 KB |
v11-0004-Prepare-freeze-tuples-in-heap_page_prune.patch | text/x-patch | 17.0 KB |
v11-0005-Set-hastup-in-heap_page_prune.patch | text/x-patch | 5.8 KB |
v11-0006-Save-dead-tuple-offsets-during-heap_page_prune.patch | text/x-patch | 7.3 KB |
v11-0007-Combine-freezing-and-pruning.patch | text/x-patch | 69.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Leung, Anthony | 2024-04-01 14:29:29 | Re: Allow non-superuser to cancel superuser tasks. |
Previous Message | David E. Wheeler | 2024-04-01 14:15:44 | Re: Security lessons from liblzma |