From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Parallel Sort
Date: 2013-05-13 16:10:04
Message-ID: CA+TgmobpyB8F93cruYtONVVd9=VSZZ=8HbFV4ThSeoJiR6NJ5w@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 13, 2013 at 10:57 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> This approach seems to me to be likely to guarantee that the startup
> overhead for any parallel sort is so large that only fantastically
> enormous sorts will come out ahead.
>
> I think you need to think in terms of restricting the problem space
> enough so that the worker startup cost can be trimmed to something
> reasonable. One obvious suggestion is to forbid the workers from
> doing any database access of their own at all --- the parent would
> have to do any required catalog lookups for sort functions etc.
> before forking the children.
>
> I think we should also seriously think about relying on fork() and
> copy-on-write semantics to launch worker subprocesses, instead of
> explicitly copying so much state over to them. Yes, this would
> foreclose ever having parallel query on Windows, but that's okay
> with me (hm, now where did I put my asbestos longjohns ...)
>
> Both of these lines of thought suggest that the workers should *not*
> be full-fledged backends.
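
For concreteness, the fork-and-inherit idea Tom describes would look
roughly like the sketch below -- plain POSIX C with purely illustrative
names, not code from the PostgreSQL tree. The parent does all the
expensive setup (standing in here for the catalog lookups for sort
functions etc.), and the forked workers simply read that state through
copy-on-write pages, with no explicit copying:

/*
 * Minimal sketch, assuming plain POSIX fork(): the parent performs all
 * expensive setup, then forks workers that inherit that state via
 * copy-on-write.  Names are illustrative stand-ins only.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define NWORKERS    4
#define STATE_SIZE  (1 << 20)

static int *lookup_state;       /* stand-in for syscache/fmgr setup */

static void
worker_main(int id)
{
    long    sum = 0;

    /* Inherited pages are shared copy-on-write; reads copy nothing. */
    for (int i = id; i < STATE_SIZE; i += NWORKERS)
        sum += lookup_state[i];
    printf("worker %d read inherited state (sum %ld)\n", id, sum);
    fflush(stdout);
    _exit(0);
}

int
main(void)
{
    /* Parent performs all setup once, before any fork(). */
    lookup_state = malloc(STATE_SIZE * sizeof(int));
    for (int i = 0; i < STATE_SIZE; i++)
        lookup_state[i] = i % 97;

    for (int id = 0; id < NWORKERS; id++)
    {
        pid_t   pid = fork();

        if (pid == 0)
            worker_main(id);
        else if (pid < 0)
        {
            perror("fork");
            exit(1);
        }
    }
    while (wait(NULL) > 0)      /* reap all workers */
        ;
    return 0;
}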

Eventually, PostgreSQL needs not only parallel sort, but a more
general parallel query facility. The goal here is not to design
something specific to parallel sort, but to provide a general
infrastructure for server-side parallelism. If we restrict ourselves
to a design where syscache lookups aren't possible from a worker
backend, I have trouble seeing how that's ever going to work. The
syscache is a very low-level facility that a great many things rely
on. Even if you could make the shuttling of requests between master
and slave transparent to the backend code, every such round trip cuts
into the amount of actually parallelizable work and adds significant
overhead.
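
To make that cost concrete, here is a minimal sketch, again in plain
POSIX C with illustrative names (the "lookup" is a stand-in, not a real
syscache call), of what shuttling lookups through the master looks
like: each worker-side lookup becomes a blocking round trip over a
pipe, and the master has to stop its own work to service it:

/*
 * A worker that may not touch the catalogs must bounce every lookup
 * off the master, serializing against it.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define NLOOKUPS    1000

int
main(void)
{
    int     req[2], rep[2];
    pid_t   pid;

    if (pipe(req) < 0 || pipe(rep) < 0)
    {
        perror("pipe");
        exit(1);
    }

    pid = fork();
    if (pid < 0)
    {
        perror("fork");
        exit(1);
    }
    if (pid == 0)
    {
        /* Worker: every "lookup" stalls until the master answers. */
        for (int key = 0; key < NLOOKUPS; key++)
        {
            int     answer;

            write(req[1], &key, sizeof(key));
            read(rep[0], &answer, sizeof(answer));
        }
        _exit(0);
    }

    /* Master: must interrupt its own work to service each request. */
    for (int i = 0; i < NLOOKUPS; i++)
    {
        int     key, answer;

        read(req[0], &key, sizeof(key));
        answer = key * 2;       /* stand-in for the real lookup */
        write(rep[1], &answer, sizeof(answer));
    }
    wait(NULL);
    return 0;
}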

I don't see any reason to panic about the worker startup cost. I
don't know whether the stuff that Noah mentioned copying will take 10
microseconds or 100 milliseconds, but there are plenty of sorts that
take many seconds or even minutes to complete, so there's still
plenty of opportunity for a win there. By definition, the things that
you want to run in parallel are the ones that take a long time if you
don't run them in parallel. Now, of course, if we can reduce the cost
of starting new backends (e.g. by keeping them around from one
parallel operation to the next, or by starting them via fork), that
will expand the range of cases where parallelism is a win. But I
think we could win in plenty of interesting real-world cases even if
it took a full second to initialize each new worker, and surely it
won't be nearly that bad.
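
To put rough numbers on that claim (the figures below are invented for
illustration, not measured): with a one-minute sort, four workers, and
a pessimistic one second of overlapped worker startup, the arithmetic
still comes out far ahead:

/*
 * Back-of-the-envelope model for the startup-cost argument.  All
 * numbers are made up for illustration.
 */
#include <stdio.h>

int
main(void)
{
    double  serial = 60.0;      /* one-minute non-parallel sort */
    double  startup = 1.0;      /* pessimistic startup, overlapped */
    int     nworkers = 4;
    double  parallel = startup + serial / nworkers;

    printf("serial %.0fs, parallel %.0fs, speedup %.1fx\n",
           serial, parallel, serial / parallel);
    /* prints: serial 60s, parallel 16s, speedup 3.8x */
    return 0;
}

Even charging the full second per worker serially (four seconds of
startup before any sorting begins) would still leave a better than 3x
speedup on that hypothetical sort.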

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
