Re: [HACKERS] [PATCH] Vacuum: Update FSM more frequently

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [PATCH] Vacuum: Update FSM more frequently
Date: 2018-02-14 06:59:31
Message-ID: CAD21AoDPKT30BL=y_OOhMSakvmHzCU=k_gKcJUmTDwhHen+2-g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 9, 2018 at 11:48 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> On Fri, Feb 9, 2018 at 1:36 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> On Fri, Feb 9, 2018 at 12:45 AM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>>> On Thu, Feb 8, 2018 at 1:36 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>>> On Tue, Feb 6, 2018 at 9:51 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>>>>> I can look into doing 3, that *might* get rid of the need to do that
>>>>> initial FSM vacuum, but all other intermediate vacuums are still
>>>>> needed.
>>>>
>>>> Understood. So how about that this patch focuses only make FSM vacuum
>>>> more frequently and leaves the initial FSM vacuum and the handling
>>>> cancellation cases? The rest can be a separate patch.
>>>
>>> Ok.
>>>
>>> Attached split patches. All but the initial FSM pass is in 1, the
>>> initial FSM pass as in the original patch is in 2.
>>>
>>> I will post an alternative patch for 2 when/if I have one. In the
>>> meanwhile, we can talk about 1.
>>
>> Thank you for updating the patch!
>>
>> + /* Tree pruning for partial vacuums */
>> + if (threshold)
>> + {
>> + child_avail = fsm_get_avail(page, slot);
>> + if (child_avail >= threshold)
>> + continue;
>> + }
>>
>> I think slots in a non-leaf page might not have a correct value
>> because fsm_set_avail doesn't propagate up beyond pages. So do we need
>> to check the root of child page rather than a slot in the non-lead
>> page? I might be missing something though.
>
> I'm not sure I understand what you're asking.
>
> Yes, a partial FSM vacuum won't correct those inconsistencies. It
> doesn't matter though, because as long as the root node (or whichever
> inner node we're looking at the time) reports free space, the lookup
> for free space will recurse and find the lower nodes, even if they
> report more space than the inner node reported. The goal of making
> free space discoverable will have been achieved nonetheless.

For example, if a slot of internal node of fsm tree has a value
greater than the threshold while the root of leaf node of fsm tree has
a value less than threshold, we want to update the internal node of
fsm tree but we will skip it with your patch. I wonder if this can
happen.

> The final FSM vacuum pass isn't partial, to finally correct all those
> small inconsistencies.

Yes, but the purpose of this patch is to prevent table bloating from
concurrent modifications?

Here is other review comments.

+ /* Tree pruning for partial vacuums */
+ if (threshold)
+ {
+ child_avail = fsm_get_avail(page, slot);
+ if (child_avail >= threshold)
+ continue;
+ }

'child_avail' is a category value while 'threshold' is an actual size
of freespace.

The logic of finding the max_freespace seems not work fine if table
with indices because max_freespace is not updated based on
post-compaction freespace. I think we need to get the max freespace
size during vacuuming heap (i.g. at lazy_vacuum_heap) and use it as
the threshold.

+ vacuum_fsm_every_pages = nblocks / 8;
+ vacuum_fsm_every_pages = Max(vacuum_fsm_every_pages, 1048576);

I think a comment for 1048576 is needed. And perhaps we can define it
as a macro.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2018-02-14 07:00:14 Re: Typo in origin.c
Previous Message Rafia Sabih 2018-02-14 06:47:37 Re: [HACKERS] Partition-wise aggregation/grouping