From: | Masayuki Takahashi <masayuki038(at)gmail(dot)com> |
---|---|
To: | thomas(dot)munro(at)enterprisedb(dot)com |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: How to estimate the shared memory size required for parallel scan? |
Date: | 2018-08-19 04:28:34 |
Message-ID: | CA+z6ocQ69eWcVqoib2sDR+A3HFWwqerbBWwUe0sRieoFE+c=FA@mail.gmail.com |
Lists: | pgsql-hackers |
(Sorry, I first sent this to Thomas only. This is a re-post.)
Hi Thomas,
Thank you for the excellent explanation of shared memory in parallel
scan and 'foreign path'.
Those are the points I wanted to know. Thanks.
> If you just supply an IsForeignScanParallelSafe function that returns
> true, that would allow your FDW to be used inside parallel workers and
> wouldn't need any extra shared memory, but it wouldn't be a "parallel
> scan". It would just be "parallel safe". Each process that does a
> scan of your FDW would expect a full normal scan (presumably returning
> the same tuples in each process).
I think the parallel scan mechanism applies each worker's full
normal scan to partitioned records, right?
For example, I made IsForeignScanParallelSafe return true in cstore_fdw
and compared partitioned and non-partitioned scans.
https://gist.github.com/masayuki038/daa63a21f8c16ffa8138b50db9129ced
This shows that rows are counted per partition and 'Gather Merge'
merges the results.
As a result, the parallel scan and aggregation show the correct count.
So, in the case of cstore_fdw, it may not be necessary to reserve
shared memory in EstimateDSMForeignScan.
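To make the "parallel safe" versus "parallel scan" distinction concrete,
here is a toy sketch in plain C. It is not PostgreSQL code and uses none
of the FDW API: SharedScanState, NBLOCKS, ROWS_PER_BLOCK and all function
names are hypothetical, and the workers are simulated by taking turns in
one thread. It only illustrates why a scan of a single relation needs a
shared cursor if each worker is to emit a disjoint fraction of the rows.

```c
#include <assert.h>
#include <stdatomic.h>

#define NBLOCKS 8          /* hypothetical: blocks in one relation */
#define ROWS_PER_BLOCK 100 /* hypothetical: rows stored in each block */

/* Stand-in for state an FDW might keep in shared memory (assumption). */
typedef struct SharedScanState {
    atomic_int next_block; /* next unclaimed block number */
} SharedScanState;

/* "Parallel safe" only: no shared state, so the worker scans every block. */
long full_scan(void)
{
    long rows = 0;
    for (int b = 0; b < NBLOCKS; b++)
        rows += ROWS_PER_BLOCK;
    return rows;
}

/* Coordinated scan: claim up to max_blocks blocks from the shared cursor. */
long partial_scan(SharedScanState *shared, int max_blocks)
{
    long rows = 0;
    assert(max_blocks >= 0);
    for (int i = 0; i < max_blocks; i++) {
        int b = atomic_fetch_add(&shared->next_block, 1);
        if (b >= NBLOCKS)
            break; /* nothing left to claim */
        rows += ROWS_PER_BLOCK;
    }
    return rows;
}

/* Rows emitted in total when every worker runs its own full scan. */
long total_uncoordinated(int nworkers)
{
    long total = 0;
    for (int w = 0; w < nworkers; w++)
        total += full_scan();
    return total;
}

/* Rows emitted in total when workers take turns claiming one block each. */
long total_coordinated(int nworkers)
{
    SharedScanState shared;
    long total = 0;
    long got;

    atomic_init(&shared.next_block, 0);
    do {
        got = 0;
        for (int w = 0; w < nworkers; w++)
            got += partial_scan(&shared, 1);
        total += got;
    } while (got > 0);
    return total;
}
```

In this sketch, three uncoordinated workers emit every row three times
over, while three coordinated workers together emit each row exactly
once, whatever the number of workers.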
> So I guess this hasn't been done before and would require some more
> research.
I agree. I will try some more query patterns.
Thanks.
On Sat, Aug 18, 2018 at 23:08, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>
> On Sun, Aug 19, 2018 at 1:40 AM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> > A true parallel scan of an FDW would be one where each process emits
> > an arbitrary fraction of the tuples, but together they emit all of the
> > tuples. You'd almost certainly need to use some shared memory to
> > coordinate that. To say that you support that, I think your
> > GetForeignPaths() function would need to call add_partial_path(). And
> > unless I'm mistaken, whether or not InitializeDSMForeignScan etc are
> > called might be the only indication you get of whether you need to run
> > in parallel-aware mode. I haven't personally heard of any FDWs that
> > can do this yet, but I just tried hacking file_fdw to register a
> > partial path and it seems to work (though of course the results are
> > duplicated because the emitted tuples are not actually partial).
>
> ... though I just noticed that my quick test used "Single Copy" mode.
> I think I see why: it looks like core's create_foreignscan_path()
> function might need to take num_workers and set parallel_aware if > 0.
> So I guess this hasn't been done before and would require some more
> research.
>
> --
> Thomas Munro
> http://www.enterprisedb.com
--
高橋 真之