From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-01-05 15:01:07
Message-ID: CA+TgmoZk+z64-ekff_wncJ0R=7dB_5jN3sMy=0vgnd6mnVaPRQ@mail.gmail.com
Lists: pgsql-hackers
On Fri, Jan 2, 2015 at 5:36 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Thu, Jan 1, 2015 at 11:29 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Thu, Jan 1, 2015 at 12:00 PM, Fabrízio de Royes Mello
>> <fabriziomello(at)gmail(dot)com> wrote:
>> > Can we check the number of free bgworkers slots to set the max workers?
>>
>> The real solution here is that this patch can't throw an error if it's
>> unable to obtain the desired number of background workers. It needs
>> to be able to smoothly degrade to a smaller number of background
>> workers, or none at all.
>
> I think handling this way can have one side effect which is that if
> we degrade to smaller number, then the cost of plan (which was
> decided by optimizer based on number of parallel workers) could
> be more than non-parallel scan.
> Ideally before finalizing the parallel plan we should reserve the
> bgworkers required to execute that plan, but I think as of now
> we can work out a solution without it.
I don't think this is very practical. When cached plans are in use,
we can have a bunch of plans sitting around that may or may not get
reused at some point in the future, possibly far in the future. The
current situation, which I think we want to maintain, is that such
plans hold no execution-time resources (e.g. locks) and, generally,
don't interfere with other things people might want to execute on the
system. Nailing down a bunch of background workers just in case we
might want to use them in the future would be pretty unfriendly.
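To make the degradation concrete, here is a minimal self-contained sketch (hypothetical names, not PostgreSQL's actual bgworker API) of an executor node that requests the planned number of workers but proceeds with however many it actually gets, down to zero:

```c
#include <assert.h>

/* Illustrative only: the optimizer's assumption and the runtime reality
 * are tracked separately, and a shortfall is not an error. */
typedef struct ParallelScanState
{
    int nworkers_planned;   /* what the optimizer assumed */
    int nworkers_launched;  /* what we actually obtained at runtime */
} ParallelScanState;

/* Stand-in for worker registration: the system can only spare
 * 'slots_free' background workers right now. */
static int
try_launch_workers(int requested, int slots_free)
{
    return requested < slots_free ? requested : slots_free;
}

static void
begin_parallel_scan(ParallelScanState *state, int slots_free)
{
    state->nworkers_launched =
        try_launch_workers(state->nworkers_planned, slots_free);
    /* Zero launched workers is fine: the leader simply performs
     * the whole scan itself, just more slowly than planned. */
}
```

The point of the sketch is that no resource is pinned between planning and execution; the plan merely records an assumption.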
I think it's right to view this in the same way we view work_mem. We
plan on the assumption that an amount of memory equal to work_mem will
be available at execution time, without actually reserving it. If the
plan happens to need that amount of memory and if it actually isn't
available when needed, then performance will suck; conceivably, the
OOM killer might trigger. But it's the user's job to avoid this by
not setting work_mem too high in the first place. Whether this system
is for the best is arguable: one can certainly imagine a system where,
if there's not enough memory at execution time, we consider
alternatives like (a) replanning with a lower memory target, (b)
waiting until more memory is available, or (c) failing outright in
lieu of driving the machine into swap. But devising such a system is
complicated -- for example, replanning with a lower memory target
might latch onto a far more expensive plan, such that we would have
been better off waiting for more memory to be available; yet waiting
until more memory is available might result in waiting
forever. And that's why we don't have such a system.
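For illustration, the plan-time assumption described above can be reduced to a sketch like this (invented names, not the real planner code): the strategy is chosen as if work_mem will be available at execution time, but nothing is actually reserved:

```c
/* Illustrative only: a planner picks a hash strategy when the estimated
 * hash table fits in work_mem, and a sort otherwise.  The memory is
 * merely assumed at plan time, never reserved; if it turns out not to
 * be available when the plan runs, performance suffers. */
typedef enum { PLAN_HASH, PLAN_SORT } PlanChoice;

static PlanChoice
choose_strategy(double est_bytes, double work_mem_bytes)
{
    return est_bytes <= work_mem_bytes ? PLAN_HASH : PLAN_SORT;
}
```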
We don't need to do any better here. The GUC should tell us how many
parallel workers we should anticipate being able to obtain. If other
settings on the system, or the overall system load, preclude us from
obtaining that number of parallel workers, then the query will take
longer to execute; and the plan might be sub-optimal. If that happens
frequently, the user should lower the planner GUC to a level that
reflects the resources actually likely to be available at execution
time.
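A hedged sketch of why the GUC matters for costing (the function and parameters here are invented for illustration, not PostgreSQL's real cost model): parallel cost estimates typically divide the CPU component among the leader and the assumed workers, so assuming more workers than will actually launch makes the parallel plan look cheaper than it will really run:

```c
/* Illustrative cost model: startup overhead plus CPU work divided
 * among the assumed workers and the leader.  Overstating
 * 'assumed_workers' relative to what the system can deliver
 * understates the true cost of the plan. */
static double
parallel_scan_cost(double cpu_cost, double startup_cost,
                   int assumed_workers)
{
    return startup_cost + cpu_cost / (assumed_workers + 1);
}
```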
By the way, another area where this kind of effect crops up is with
the presence of particular disk blocks in shared_buffers or the system
buffer cache. Right now, the planner makes no attempt to cost a scan
of a frequently-used, fully-cached relation differently from a scan of
a rarely-used, probably-not-cached relation; and that sometimes leads to
bad plans. But if it did try to do that, then we'd have the same kind
of problem discussed here -- things might change between planning and
execution, or even after the beginning of execution. Also, we might
get nasty feedback effects: since the relation isn't cached, we view a
plan that would involve reading it in as very expensive, and avoid
such a plan. However, we might be better off picking the "slow" plan
anyway, because it might be that once we've read the data once it will
stay cached and run much more quickly than some plan that seems better
starting from a cold cache.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company