From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel Seq Scan |
Date: | 2015-01-11 04:14:47 |
Message-ID: | CAA4eK1JiPPwaXF3XrSXuTdfzcVEForCKrRo6jnPriFLU8rROJQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Jan 11, 2015 at 9:09 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, Jan 8, 2015 at 6:42 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
> > 2. To enable two types of shared memory queue's (error queue and
> > tuple queue), we need to ensure that we switch to appropriate queue
> > during communication of various messages from parallel worker
> > to master backend. There are two ways to do it
> > a. Save the information about error queue during startup of parallel
> > worker (ParallelMain()) and then during error, set the same
(switch
> > to error queue in errstart() and switch back to tuple queue in
> > errfinish() and errstart() in case errstart() doesn't need to
> > propagate
> > error).
> > b. Do something similar as (a) for tuple queue in printtup or other
> > place
> > if any for non-error messages.
> > I think approach (a) is slightly better as compare to approach (b) as
> > we need to switch many times for tuple queue (for each tuple) and
> > there could be multiple places where we need to do the same. For now,
> > I have used approach (a) in Patch which needs some more work if we
> > agree on the same.
>
> I don't think you should be "switching" queues. The tuples should be
> sent to the tuple queue, and errors and notices to the error queue.
>
To achieve what you said (The tuples should be sent to the tuple
queue, and errors and notices to the error queue.), we need to
switch the queues.
The difficulty here is that once we set the queue (using
pq_redirect_to_shm_mq()) through which the communication has to
happen, it will use the same unless we change again the queue
using pq_redirect_to_shm_mq(). For example, assume we have
initially set error queue (using pq_redirect_to_shm_mq()) then to
send tuples, we need to call pq_redirect_to_shm_mq() to
set the tuple queue as the queue that needs to be used for communication
and again if error happens then we need to do the same for error
queue.
Do you have any other idea to achieve the same?
> > 3. As per current implementation of Parallel_seqscan, it needs to use
> > some information from parallel.c which was not exposed, so I have
> > exposed the same by moving it to parallel.h. Information that is
required
> > is as follows:
> > ParallelWorkerNumber, FixedParallelState and shm keys -
> > This is used to decide the blocks that needs to be scanned.
> > We might change it in future the way parallel scan/work distribution
> > is done, but I don't see any harm in exposing this information.
>
> Hmm. I can see why ParallelWorkerNumber might need to be exposed, but
> the other stuff seems like it shouldn't be.
>
It depends upon how we decide to achieve the scan of blocks
by backend worker. In current form, the patch needs to know
if myworker is the last worker (and I have used workers_expected
to achieve the same, I know that is not the right thing but I need
something similar if we decide to do in the way I have proposed),
so that it can scan all the remaining blocks.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2015-01-11 04:32:47 | INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0 |
Previous Message | Andreas Karlsson | 2015-01-11 04:07:13 | Re: Using 128-bit integers for sum, avg and statistics aggregates |