Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2014-07-16 12:30:57
Message-ID: 4205E661176A124FAF891E0A6BA91352663439D9@szxeml509-mbs.china.huawei.com
Lists: pgsql-hackers

On 16 July 2014 12:13, Magnus Hagander wrote:

>>Yeah, those are exactly my points. I think it would be significantly simpler to do it that way, rather than forking and threading. And also easier to make portable...

>>(and as an optimization on Alvaro's suggestion, you can of course reuse the initial connection as one of the workers, as long as you got the full list of tasks from it up front, which I think you do anyway in order to sort the tasks...)

Oh, I got your point. I will update my patch and send it.

Now we can completely remove the vac_parallel.h file, and no refactoring is needed either :)
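
Just so we are on the same page, below is a rough sketch (not the actual patch) of the select()-based loop Alvaro described, using libpq's asynchronous API. Only the libpq calls (PQsendQuery, PQsocket, PQconsumeInput, PQisBusy, PQgetResult, PQclear) are real API; the function name and the conns[]/tasks[] arrays are placeholders, and error handling is omitted:

/*
 * Sketch only: conns[] holds the already-open worker connections
 * (the initial connection can simply be conns[0]) and tasks[] holds
 * the VACUUM commands built from the task list.
 */
#include <stdbool.h>
#include <sys/select.h>
#include <libpq-fe.h>

static void
run_parallel_vacuum(PGconn **conns, int nconns,
                    const char **tasks, int ntasks)
{
    int         next = 0;

    /* hand one task to every connection up front */
    for (int i = 0; i < nconns && next < ntasks; i++)
        PQsendQuery(conns[i], tasks[next++]);

    for (;;)
    {
        fd_set      rfds;
        int         maxfd = -1;
        bool        anybusy = false;

        FD_ZERO(&rfds);
        for (int i = 0; i < nconns; i++)
        {
            if (PQisBusy(conns[i]))
            {
                int         sock = PQsocket(conns[i]);

                FD_SET(sock, &rfds);
                if (sock > maxfd)
                    maxfd = sock;
                anybusy = true;
            }
        }

        if (!anybusy)
            break;              /* task list empty and all workers idle */

        /* sleep until at least one worker has something to read */
        select(maxfd + 1, &rfds, NULL, NULL, NULL);

        for (int i = 0; i < nconns; i++)
        {
            if (!FD_ISSET(PQsocket(conns[i]), &rfds))
                continue;

            PQconsumeInput(conns[i]);
            if (!PQisBusy(conns[i]))
            {
                PGresult   *res;

                /* this VACUUM is done; drain its results */
                while ((res = PQgetResult(conns[i])) != NULL)
                    PQclear(res);

                /* then give the idle connection the next item */
                if (next < ntasks)
                    PQsendQuery(conns[i], tasks[next++]);
            }
        }
    }
}

The initial connection that builds the task list can simply be passed in as conns[0], which also covers the reuse optimization you mentioned.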

Thanks & Regards,

Dilip Kumar

From: Magnus Hagander [mailto:magnus(at)hagander(dot)net]
Sent: 16 July 2014 12:13
To: Alvaro Herrera
Cc: Dilip kumar; Jan Lentfer; Tom Lane; PostgreSQL-development; Sawada Masahiko; Euler Taveira
Subject: Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

On Jul 16, 2014 7:05 AM, "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com> wrote:
>
> Tom Lane wrote:
> > Dilip kumar <dilip(dot)kumar(at)huawei(dot)com> writes:
> > > On 15 July 2014 19:01, Magnus Hagander wrote:
> > >> I am late to this game, but the first thing to my mind was - do we
> > >> really need the whole forking/threading thing on the client at all?
> >
> > > Thanks for the review. I understand your point, but I think if we do this directly with independent connections,
> > > it's difficult to divide the jobs equally between multiple independent connections.
> >
> > That argument seems like complete nonsense. You're confusing work
> > allocation strategy with the implementation technology for the multiple
> > working threads. I see no reason why a good allocation strategy couldn't
> > work with either approach; indeed, I think it would likely be easier to
> > do some things *without* client-side physical parallelism, because that
> > makes it much simpler to handle feedback between the results of different
> > operational threads.
>
> So you would have one initial connection, which generates a task list;
> then open N libpq connections. Launch one vacuum on each, and then
> sleep on select() on those N sockets. Whenever one returns
> read-ready, the vacuuming is done and we send another item from the task
> list. Repeat until the task list is empty. No need to fork anything.
>

Yeah, those are exactly my points. I think it would be significantly simpler to do it that way, rather than forking and threading. And also easier to make portable...

(and as an optimization on Alvaro's suggestion, you can of course reuse the initial connection as one of the workers, as long as you got the full list of tasks from it up front, which I think you do anyway in order to sort the tasks...)

/Magnus
