From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: group locking: incomplete patch, just for discussion |
Date: | 2014-10-28 23:24:06 |
Message-ID: | CA+TgmoY9F0tdA39hrKZT0J2K2hn0Jut7fGrFXmf3m8A3QY_auw@mail.gmail.com |
Lists: | pgsql-hackers |

On Tue, Oct 28, 2014 at 4:48 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 16 October 2014 16:22, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> Might I gently enquire what the "something usable" we are going to see
>>> in this release? I'm not up on current plans.
>>
>> I don't know how far I'm going to get for this release yet. I think
>> pg_background is a pretty good milestone, and useful in its own right.
>> I would like to get something that's truly parallel working sooner
>> rather than later, but this group locking issue is one of 2 or 3
>> significant hurdles that I need to climb over first.
>
> pg_background is very cute, but really its not really a step forward,
> or at least very far. It's sounding like you've already decided that
> is as far as we're going to get this release, which I'm disappointed
> about.
>
> Given your description of pg_background it looks an awful lot like
> infrastructure to make Autonomous Transactions work, but it doesn't
> even do that. I guess it could do in a very small additional patch, so
> maybe it is useful for something.
>
> You asked for my help, but I'd like to see some concrete steps towards
> an interim feature so I can see some benefit in a clear direction.
>
> Can we please have the first step we discussed? Parallel CREATE INDEX?
> (Note the please)

What I've been thinking about trying to work towards is parallel
sequential scan. I think that it would actually be pretty easy to
code up a mostly-working version using the existing infrastructure,
but the patch would be rejected with a bazooka, because the
non-working parts would include things like:

1. The cooperating backends might not all be using the same snapshot,
because that requires sharing the snapshot, combo CID hash, and
transaction state.
2. The quals that got pushed down to the workers might not return the
same answers that they would have produced with a single backend,
because we have no mechanism for assessing pushdown-safety.
3. Deadlock detection would be to some greater or lesser degree
broken, the details depending on the implementation choices you made.

There is a bit of a chicken-and-egg problem here. If I submit a patch
for parallel sequential scan, it'll (justifiably) get rejected because
it doesn't solve those problems. So I'm trying to solve the problems
enumerated above first, with working and at least somewhat-useful
examples that show how the incremental bits of infrastructure can be
used to do stuff. But that leads to your (understandable) complaint
that this isn't very real yet.

Why am I now thinking about parallel sequential scan instead of
parallel CREATE INDEX? You may remember that I posted a patch for a
new memory allocator some time ago, and it came in for a fair amount
of criticism and not much approbation. Some of that criticism was
certainly justified, and perhaps I was as hard on myself as anyone
else was. However you want to look at it, I see the trade-off between
parallel sort (which is most of what parallel CREATE INDEX amounts to)
and parallel seq-scan this way: parallel seq-scan requires dealing
with the planner (ouch!) but parallel sort requires dealing with
memory allocation in dynamic shared memory segments (ouch!). Both of
them require solving the three problems listed above.

And maybe a few others, but I think those are the big ones - and I
think proper deadlock detection is the hardest of them. A colleague
of mine has drafted patches for sharing snapshots and combo CIDs
between processes, and as you might expect that's pretty easy.
Sharing the transaction state (so we can test whether a transaction ID
is "our" transaction ID inside the worker) is a bit trickier, but I
think not too hard. Assessing pushdown-safety will probably boil down
to adding some equivalent of proisparallel. Maybe not the most
elegant, but defensible, and if you're looking for the shortest path
to something usable, that's probably it. But deadlock detection ...
well, I don't see any simpler solution than what I'm trying to build
here.
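
To make the proisparallel idea a bit more concrete, here is a rough
sketch of what a per-function safety flag could buy: a qual is only
handed to a worker if every function it calls is marked safe. This is
not taken from any patch. FuncEntry, Expr, the parallel_safe field,
qual_is_pushdown_safe() and run_example() are all hypothetical names,
and which functions are flagged safe or unsafe below is an assumption
made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical catalog row: one entry per function, carrying a flag that
 * plays the role of the "proisparallel" column floated above. */
typedef struct FuncEntry
{
    const char *name;
    bool        parallel_safe;  /* assumed flag, not an existing column */
} FuncEntry;

/* Toy expression node: a function call over up to two arguments.
 * func == NULL stands for a constant or a plain column reference. */
typedef struct Expr
{
    const FuncEntry   *func;
    const struct Expr *args[2];
    int                nargs;
} Expr;

/* A qual may be pushed down only if every function anywhere in its
 * expression tree is flagged safe. */
static bool
qual_is_pushdown_safe(const Expr *e)
{
    if (e == NULL)
        return true;
    if (e->func != NULL && !e->func->parallel_safe)
        return false;
    for (int i = 0; i < e->nargs; i++)
    {
        if (!qual_is_pushdown_safe(e->args[i]))
            return false;
    }
    return true;
}

/* Small self-check; the safe/unsafe markings are illustrative only. */
static void
run_example(void)
{
    FuncEntry lt = {"int4lt", true};
    FuncEntry nv = {"nextval", false};

    Expr col = {NULL, {NULL, NULL}, 0};
    Expr plain_cmp = {&lt, {&col, &col}, 2};    /* a < b          */
    Expr seq_call = {&nv, {&col, NULL}, 1};     /* nextval(x)     */
    Expr mixed = {&lt, {&col, &seq_call}, 2};   /* a < nextval(x) */

    assert(qual_is_pushdown_safe(&plain_cmp));
    assert(!qual_is_pushdown_safe(&mixed));
}
```

The walk is deliberately conservative: one unsafe function anywhere in
the tree keeps the whole qual in the leader. A real version would
presumably have to cover operators, casts and so on as well, which is
where the "not the most elegant" part comes in.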

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company