From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel Seq Scan |
Date: | 2015-01-11 03:39:26 |
Message-ID: | CA+TgmobBZ=0n=JcS28hBxVBaSXeZHBQCnxVzCTUSPMe1zsuGdw@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Jan 8, 2015 at 6:42 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Are we sure that in such cases we will consume work_mem during
> execution? In the case of parallel_workers we are sure, to an extent,
> that if we reserve the workers then we will use them during execution.
> Nonetheless, I have proceeded and integrated the parallel_seqscan
> patch with v0.3 of the parallel_mode patch posted by you at the link below:
> http://www.postgresql.org/message-id/CA+TgmoYmp_=XcJEhvJZt9P8drBgW-pDpjHxBhZA79+M4o-CZQA@mail.gmail.com
That depends on the costing model. It makes no sense to do a parallel
sequential scan on a small relation, because the user backend can scan
the whole thing itself faster than the workers can start up. I
suspect it may also be true that the useful amount of parallelism
increases the larger the relation gets (but maybe not).
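
To make the costing point concrete, here is a minimal sketch of one way a planner could pick a worker count from relation size. The function name, the MIN_PARALLEL_PAGES threshold, and the "one more worker each time the relation triples" rule are all assumptions made up for illustration; none of them come from the patch under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical threshold: relations smaller than this are scanned serially. */
#define MIN_PARALLEL_PAGES 1000

/*
 * Return the number of workers to request for a relation of rel_pages
 * pages, capped at max_workers.  Zero means "do a plain sequential scan".
 * Assumed rule: one worker at the threshold, one more each time the
 * relation size triples.
 */
static int
choose_parallel_workers(uint32_t rel_pages, int max_workers)
{
    int      workers = 0;
    uint64_t threshold = MIN_PARALLEL_PAGES;  /* 64-bit so tripling cannot overflow */

    assert(max_workers >= 0);

    while (workers < max_workers && rel_pages >= threshold)
    {
        workers++;
        threshold *= 3;
    }
    return workers;
}
```

With these assumed numbers, a 10-page relation gets no workers at all, which matches the point that the user backend can finish the scan before any worker starts up.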
> 2. To enable two types of shared memory queues (error queue and
> tuple queue), we need to ensure that we switch to the appropriate queue
> when communicating the various messages from the parallel worker
> to the master backend. There are two ways to do it:
> a. Save the information about the error queue during startup of the parallel
> worker (ParallelMain()) and then, on error, switch to it: switch to the
> error queue in errstart() and switch back to the tuple queue in errfinish()
> (and in errstart() itself in case errstart() doesn't need to propagate the
> error).
> b. Do something similar to (a) for the tuple queue in printtup, or any other
> place that sends non-error messages.
> I think approach (a) is slightly better compared to approach (b), as
> with (b) we would need to switch many times for the tuple queue (once per
> tuple) and there could be multiple places where we need to do the same.
> For now, I have used approach (a) in the patch, which needs some more work
> if we agree on it.
I don't think you should be "switching" queues. The tuples should be
sent to the tuple queue, and errors and notices to the error queue.
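
A rough sketch of what "no switching" could look like: the worker holds both destinations for its whole lifetime and each message is routed by its kind. DemoQueue is a stand-in counter, not the real shm_mq API, and every type and function name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Kinds of message a worker might send (illustrative only). */
typedef enum { MSG_TUPLE, MSG_ERROR, MSG_NOTICE } MsgKind;

/* Stand-in for a shared memory queue handle; it only counts messages. */
typedef struct { int count; } DemoQueue;

/* Both queues stay attached for the worker's lifetime; nothing is swapped. */
typedef struct
{
    DemoQueue *tuple_queue;
    DemoQueue *error_queue;
} WorkerQueues;

/* Pick the destination purely from the message kind. */
static DemoQueue *
queue_for(const WorkerQueues *q, MsgKind kind)
{
    assert(q != NULL);
    switch (kind)
    {
        case MSG_TUPLE:
            return q->tuple_queue;
        case MSG_ERROR:
        case MSG_NOTICE:
            return q->error_queue;
    }
    return NULL;  /* not reached for valid kinds */
}

/* "Send" a message by bumping the counter of the chosen queue. */
static void
send_message(const WorkerQueues *q, MsgKind kind)
{
    DemoQueue *dest = queue_for(q, kind);

    assert(dest != NULL);
    dest->count++;
}
```

Because the destination is a function of the message kind, there is no global "current queue" to save and restore around errstart() and errfinish().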
> 3. As per current implementation of Parallel_seqscan, it needs to use
> some information from parallel.c which was not exposed, so I have
> exposed the same by moving it to parallel.h. Information that is required
> is as follows:
> ParallelWorkerNumber, FixedParallelState and shm keys -
> This is used to decide the blocks that need to be scanned.
> We might change the way parallel scan/work distribution is done in the
> future, but I don't see any harm in exposing this information.
Hmm. I can see why ParallelWorkerNumber might need to be exposed, but
the other stuff seems like it shouldn't be.
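
For illustration, a worker number alone is enough to derive a block range if the total block count and worker count are passed in some other way. The sketch below assumes a simple contiguous split; the BlockRange type and blocks_for_worker() are hypothetical and are not claimed to match how the patch distributes blocks.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open range of block numbers: [start, end). */
typedef struct
{
    uint32_t start;
    uint32_t end;
} BlockRange;

/*
 * Split nblocks into nworkers contiguous ranges and return the range for
 * worker_number (0-based).  When the division is uneven, the first
 * (nblocks % nworkers) workers each take one extra block.
 */
static BlockRange
blocks_for_worker(uint32_t nblocks, int nworkers, int worker_number)
{
    BlockRange r;
    uint32_t   n, w, base, extra;

    assert(nworkers > 0);
    assert(worker_number >= 0 && worker_number < nworkers);

    n = (uint32_t) nworkers;
    w = (uint32_t) worker_number;
    base = nblocks / n;
    extra = nblocks % n;

    r.start = w * base + (w < extra ? w : extra);
    r.end = r.start + base + (w < extra ? 1 : 0);
    return r;
}
```

Nothing here needs FixedParallelState or the shm keys; the caller supplies the block count and the number of workers as plain arguments.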
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company