Re: Block level parallel vacuum WIP

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Block level parallel vacuum WIP
Date: 2017-01-10 06:46:41
Message-ID: CAA4eK1KP+M-AD_VddzB2_AU5isZajfwkD8BFFKBd6V7-YY28Bw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 9, 2017 at 2:18 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Sat, Jan 7, 2017 at 2:47 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Fri, Jan 6, 2017 at 11:08 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>> On Mon, Oct 3, 2016 at 11:00 AM, Michael Paquier
>>> <michael(dot)paquier(at)gmail(dot)com> wrote:
>>>> On Fri, Sep 16, 2016 at 6:56 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>>>> Yeah, I don't have a good solution for this problem so far.
>>>>> We might need to improve the group locking mechanism for the updating
>>>>> operation or come up with another approach to resolve this problem.
>>>>> For example, one possible idea is that the launcher process allocates
>>>>> enough vm and fsm pages in advance so that parallel workers don't
>>>>> need to extend those forks, but that doesn't resolve the fundamental
>>>>> problem.
>>>>
>>>
>>> I got some advice at PGConf.ASIA 2016 and started working on this again.
>>>
>>> The biggest problem so far is group locking. As I mentioned
>>> before, parallel vacuum workers could try to extend the same visibility
>>> map page at the same time. So we either need to make group locking
>>> conflict in some cases, or need to eliminate the need to acquire the
>>> extension lock at all. The attached 000 patch uses the former idea:
>>> it makes group locking conflict between parallel workers when a worker
>>> tries to acquire an extension lock on the same page.
>>>
>>
>> How are you planning to ensure the same in the deadlock detector?
>> Currently, the deadlock detector considers members of the same lock
>> group as non-blocking. If you think we don't need to make any changes
>> in the deadlock detector, then please explain why.
>>
>
> Thank you for the comment.
> I had not considered the necessity of deadlock detection support. But
> because lazy_scan_heap acquires the relation extension lock and
> releases it before acquiring another extension lock, I guess we don't
> need those changes for parallel lazy vacuum. Thoughts?
>
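For reference, the pattern being relied on here is roughly the
following (a minimal sketch of the behaviour described above, not a
quote of lazy_scan_heap itself; the function name is mine):

#include "postgres.h"
#include "storage/lmgr.h"
#include "utils/rel.h"

/*
 * Sketch: lazy vacuum never holds one relation extension lock while
 * waiting for another; each lock is released before the next request.
 */
static void
extend_one_page(Relation onerel)
{
	LockRelationForExtension(onerel, ExclusiveLock);
	/* ... extend the heap/VM/FSM fork by one page ... */
	UnlockRelationForExtension(onerel, ExclusiveLock);
}

Under that pattern, extension locks alone cannot form a wait cycle
among the group members.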

Okay, but it is quite possible that lazy_scan_heap is not able to
acquire the required lock because it is already held by another
process (one that is not part of the group performing the vacuum).
Then all the processes in the group might need to run the deadlock
detector code, which in multiple places assumes that group members
won't conflict. As an example, refer to the code in TopoSort, which
tries to emit all groupmates together; IIRC, that part of the code
rests on the assumption that groupmates won't conflict with each
other, and this patch breaks that assumption. I have not looked into
the parallel vacuum patch, but the changes in
000_make_group_locking_conflict_extend_lock_v2 don't appear to be
safe. Even if your parallel vacuum patch doesn't need any change in
the deadlock detector, the proposed locking changes will behave the
same way for any operation that performs relation extension. So in
the future, any parallel operation (say parallel copy/insert) that
involves the relation extension lock will behave similarly. Is that
okay, or are you assuming that the next person developing such a
feature should rethink this problem and extend your solution to
match their requirements?
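To make the assumption concrete, here is a minimal sketch (my
illustration of the idea under discussion, not code from the 000
patch; the function name is mine) of the kind of carve-out that
breaks it:

#include "postgres.h"
#include "storage/lock.h"

/*
 * Sketch: group locking normally treats members of the same lock
 * group as never blocking each other.  The proposed exception is
 * relation extension locks, which must conflict even between
 * groupmates so that two workers cannot extend the same fork at once.
 */
static bool
group_members_conflict_on(const LOCKTAG *locktag)
{
	if (locktag->locktag_type != LOCKTAG_RELATION_EXTEND)
		return false;	/* usual group-locking rule applies */

	return true;		/* groupmates conflict on extension locks */
}

Once such an exception exists, every consumer of the wait queue that
relies on "groupmates never block each other" (TopoSort included)
needs to be audited.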

> What do we actually gain from having the other parts of VACUUM execute
> in parallel? Does truncation happen faster in parallel?
>

I think all CPU-intensive operations on the heap (like checking
whether rows are dead or live, processing dead tuples, etc.) can be
faster.
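As a rough illustration (a sketch under my own assumptions, not code
from the patch; process_one_block is a placeholder, not a real
PostgreSQL function), the CPU-bound part of the heap pass is
per-block work that divides naturally across workers:

#include "postgres.h"
#include "storage/block.h"
#include "utils/rel.h"

/* Placeholder for the CPU-heavy per-block work of lazy_scan_heap:
 * visibility checks on each tuple, collecting dead tuple TIDs for
 * the index pass, and pruning the page where possible. */
static void
process_one_block(Relation onerel, BlockNumber blkno)
{
}

/* Sketch: give each worker a disjoint block range to process. */
static void
worker_heap_pass(Relation onerel, BlockNumber start, BlockNumber end)
{
	BlockNumber blkno;

	for (blkno = start; blkno < end; blkno++)
		process_one_block(onerel, blkno);
}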

> Can you give us some timings for performance of the different phases,
> with varying levels of parallelism?

I feel the timings depend on the kind of test we perform. For
example, if there are many dead rows in the heap and only a few
indexes on the table, we might see a substantial gain from the
parallel heap scan.
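
As a back-of-the-envelope example (my numbers, purely illustrative):
if the parallelizable heap work is a fraction p of the total vacuum
time and we use w workers, the best case is roughly

    speedup = 1 / ((1 - p) + p / w)

With p = 0.8 and w = 4 that gives 1 / (0.2 + 0.2) = 2.5x, while with
p = 0.3 the same four workers give only about 1.3x. That is why a
table with many dead rows and few indexes should show the heap-scan
gain most clearly.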

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
