Re: [HACKERS] Block level parallel vacuum

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-11-04 08:26:07
Message-ID: CAFiTN-uBPj67quhS-gKSuhJVYd_tEQRs87ipjvSpxWqUX8bhLg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 4, 2019 at 1:00 PM Masahiko Sawada
<masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>
> On Mon, 4 Nov 2019 at 14:02, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Fri, Nov 1, 2019 at 2:21 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > I think the two approaches make parallel vacuum workers wait in
> > > different ways: in approach (a) the vacuum delay works as if vacuum is
> > > performed by a single process, whereas in approach (b) the
> > > vacuum delay works for each worker independently.
> > >
> > > Suppose that the total number of blocks to vacuum is 10,000,
> > > the cost per block is 10, the cost limit is 200 and the sleep time is 5
> > > ms. In single-process vacuum the total sleep time is 2,500 ms (=
> > > (10,000 * 10 / 200) * 5). Approach (a) gives the same 2,500 ms,
> > > because all parallel vacuum workers use the shared balance value and a
> > > worker sleeps once the balance value exceeds the limit. In
> > > approach (b), since the cost limit is divided evenly, the limit of each
> > > worker is 40 (e.g. with a parallel degree of 5). Supposing each worker
> > > processes blocks evenly, the total sleep time of all workers is
> > > 12,500 ms (= (2,000 * 10 / 40) * 5 * 5). I think that's why we can
> > > compute the sleep time of approach (b) by dividing the total value by
> > > the number of parallel workers.
> > >
> > > IOW approach (b) makes parallel vacuum delay much more than both normal
> > > vacuum and parallel vacuum with approach (a), even with the same
> > > settings. Which behavior do we expect?
> > >
> >
> > Yeah, this is an important thing to decide. I don't think that the
> > conclusion you are drawing is correct, because if that is true then the
> > same applies to the current autovacuum work division, where we divide
> > the cost_limit among workers but the cost_delay is the same (see
> > autovac_balance_cost). Basically, if we consider the delay time of
> > each worker independently, then it would appear that the parallel vacuum
> > delay with approach (b) is longer, but that is true only if the workers
> > run serially, which is not the case.
> >
> > > I thought the vacuum delay for
> > > parallel vacuum should work as if it's a single process vacuum as we
> > > did for memory usage. I might be missing something. If we prefer
> > > approach(b) I should change the patch so that the leader process
> > > divides the cost limit evenly.
> > >
> >
> > I am also not completely sure which approach is better but I slightly
> > lean towards approach (b).
>
> Can we get the same sleep time as approach (b) if we divide the cost
> limit by the number of workers and have the shared cost balance (i.e.
> approach (a) with dividing the cost limit)? Currently approach (b)
> seems better, but I'm concerned that it might unnecessarily delay
> vacuum if some indexes are very small or if bulk-deletion does
> almost nothing for some indexes, such as BRIN.

Are you worried that some of the workers might have little I/O to do,
yet we would still divide the cost limit equally among them? If so,
isn't that also the case with the autovacuum workers?
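
To make the numbers quoted above concrete, here is a rough Python sketch
of the arithmetic (it is not PostgreSQL code). The function name sleep_ms
and the skewed block distribution at the end are my own assumptions for
illustration only; the other values are the ones from the example upthread.

```python
def sleep_ms(blocks, cost_per_block, cost_limit, delay_ms):
    """Total sleep time if we sleep delay_ms each time the accumulated
    cost reaches cost_limit (simplified model, hypothetical helper)."""
    return (blocks * cost_per_block // cost_limit) * delay_ms


BLOCKS = 10_000        # total blocks to vacuum
COST_PER_BLOCK = 10
COST_LIMIT = 200
DELAY_MS = 5
WORKERS = 5            # parallel degree

# Single-process vacuum: 2,500 ms.
single = sleep_ms(BLOCKS, COST_PER_BLOCK, COST_LIMIT, DELAY_MS)

# Approach (a): one shared balance against the full limit, so the total
# number of sleeps matches the single-process case: 2,500 ms.
approach_a_total = sleep_ms(BLOCKS, COST_PER_BLOCK, COST_LIMIT, DELAY_MS)

# Approach (b): each worker gets limit / WORKERS = 40 and, with an even
# split, 2,000 blocks. Each worker sleeps 2,500 ms.
per_worker_b = sleep_ms(BLOCKS // WORKERS, COST_PER_BLOCK,
                        COST_LIMIT // WORKERS, DELAY_MS)

# Summed over workers this is 12,500 ms, but the workers run
# concurrently, so the elapsed sleep is the per-worker value.
approach_b_total = per_worker_b * WORKERS
approach_b_elapsed = per_worker_b

# ASSUMPTION: a skewed split (one large index, four small ones) to show
# the case where dividing the limit evenly could delay vacuum.
skewed_blocks = [9_200, 200, 200, 200, 200]
skewed_sleeps = [sleep_ms(b, COST_PER_BLOCK, COST_LIMIT // WORKERS, DELAY_MS)
                 for b in skewed_blocks]
skewed_elapsed = max(skewed_sleeps)   # bounded by the busiest worker

print(single, approach_a_total, approach_b_total, approach_b_elapsed)
print(skewed_sleeps, skewed_elapsed)
```

With an even split, approach (b) sleeps no longer in elapsed time than a
single process. In the skewed case the worker on the large index sleeps
11,500 ms under this model while the others finish after 250 ms each,
which I think is the situation the question above is about.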

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
