Re: cost based vacuum (parallel)

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: cost based vacuum (parallel)
Date: 2019-11-18 06:40:59
Message-ID: CA+fd4k4T2udSkcDWKix1s18bKMVworsRXm0ZAujtQ7tJk0XAUg@mail.gmail.com
Lists: pgsql-hackers

On Fri, 15 Nov 2019 at 11:54, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Nov 13, 2019 at 10:02 AM Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> >
> > I've done some tests while changing the shared buffer size, delays,
> > and number of workers. The overall results have a similar tendency
> > to the results shared by Dilip and look reasonable to me.
> >
>
> Thanks, Sawada-san, for repeating the tests. I can see from your,
> Dilip's, and Mahendra's testing that the delay is distributed
> depending on the I/O done by a particular worker, and that the total
> I/O is also as expected in various kinds of scenarios. So, I think
> this is a better approach. Do you agree, or do you think we should
> still investigate another approach as well?
>
> I would like to summarize this approach. The basic idea for parallel
> vacuum is to allow the parallel workers and the master backend to
> have a shared view of the vacuum cost related parameters (mainly
> VacuumCostBalance), to let each worker update it, and then decide
> based on that whether it needs to sleep. With this basic idea alone,
> we found that in some cases the throttling is not accurate, as
> explained with an example in my email above [1] and by the tests
> performed by Dilip and others in the following emails (in short, the
> workers doing more I/O can be throttled less). Then, as discussed in
> a later email [2], we tried a way to avoid putting to sleep workers
> that have done less or no I/O compared to other workers. This
> ensured that workers doing more I/O get throttled more. The idea is
> to allow a worker to sleep only if it has performed I/O above a
> certain threshold and the overall balance is more than the cost_limit
> set by the system. We then allow the worker to sleep in proportion
> to the work done by it and reduce VacuumSharedCostBalance by the
> amount consumed by the current worker. This scheme leads to the
> desired throttling of different workers based on the work done by
> each individual worker.
>
> We have tested this idea with various kinds of workloads, such as
> varying the shared buffer size, delays, and number of workers. We
> have also tried different numbers of indexes and workers. In all the
> tests, we found that the workers are throttled in proportion to the
> I/O done by each particular worker.

Thank you for summarizing!

I agree with this approach.
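
To make the summarized scheme concrete, here is a minimal sketch of the
delay logic as I understand it. This is for illustration only, not the
exact patch: the helper name parallel_vacuum_delay_point, the nworkers
variable, the worker-local VacuumCostBalanceLocal, and the use of a
pg_atomic_uint32 for VacuumSharedCostBalance are my assumptions.

#include "postgres.h"

#include "miscadmin.h"          /* VacuumCostBalance, VacuumCostLimit, VacuumCostDelay */
#include "port/atomics.h"

/* Assumed to live in the DSM segment shared by the leader and workers. */
extern pg_atomic_uint32 *VacuumSharedCostBalance;

/* Assumed number of participants (workers plus leader). */
extern int      nworkers;

/* Cost this process has accumulated so far (process-local). */
static double   VacuumCostBalanceLocal = 0;

/*
 * A worker sleeps only if it has itself done at least a fair share of
 * the work and the shared balance has exceeded the cost limit.  The
 * sleep time is proportional to the work done by this worker, and that
 * work is then charged against the shared balance.
 */
static void
parallel_vacuum_delay_point(void)
{
    double      msec = 0;

    /* Publish the cost accumulated since the last check. */
    pg_atomic_add_fetch_u32(VacuumSharedCostBalance,
                            (uint32) VacuumCostBalance);
    VacuumCostBalanceLocal += VacuumCostBalance;
    VacuumCostBalance = 0;

    if (VacuumCostBalanceLocal > 0.5 * ((double) VacuumCostLimit / nworkers) &&
        pg_atomic_read_u32(VacuumSharedCostBalance) >= (uint32) VacuumCostLimit)
    {
        /* Sleep in proportion to the work done by this worker ... */
        msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;

        /* ... and subtract that work from the shared balance. */
        pg_atomic_sub_fetch_u32(VacuumSharedCostBalance,
                                (uint32) VacuumCostBalanceLocal);
        VacuumCostBalanceLocal = 0;
    }

    if (msec > 0)
        pg_usleep((long) (msec * 1000));
}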

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
