From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com> |
Subject: | Re: ATTACH/DETACH PARTITION CONCURRENTLY |
Date: | 2018-11-07 18:37:10 |
Message-ID: | CA+TgmoY1DKi+KRnKouP==dBdOPQFWotVzwFk1VH8McXKGe8B7g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Nov 7, 2018 at 12:58 PM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> On 2018-Nov-06, Robert Haas wrote:
> > - get_partition_dispatch_recurse and ExecCreatePartitionPruneState
> > both call RelationGetPartitionDesc.
>
> My patch deals with this by caching descriptors in the active snapshot.
> So those two things would get the same partition descriptor. There's no
> RelationGetPartitionDesc anymore, and SnapshotGetPartitionDesc takes its
> place.
>
> (I tried to use different scoping than the active snapshot; I first
> tried the Portal, then I tried the resource owner. But nothing seems to
> fit as precisely as the active snapshot.)
...
> In other words, I already solved these problems you list.
>
> Maybe you could give my patch a look.
I have, a bit. One problem I'm having is that while you explained the
design you chose in a fair amount of detail, you didn't give a lot of
explanation (that I have seen) of the reasons why you chose that
design. If there's a README or a particularly good comment someplace
that I should be reading to understand that better, please point me in
the right direction.
And also, I just don't really understand what all the problems are
yet. I'm only starting to study this.
I am a bit skeptical of your approach, though. Tying it to the active
snapshot seems like an awfully big hammer. Snapshot manipulation can
be a performance bottleneck both in terms of actual performance and
also in terms of code complexity, and I don't really like the idea of
adding more code there. It's not a sustainable pattern for making DDL
work concurrently, either -- I'm pretty sure we don't want to add new
code to things like GetLatestSnapshot() every time we want to make a
new kind of DDL concurrent. Also, while hash table lookups are pretty
cheap, they're not free. In my opinion, to the extent that we can, it
would be better to refactor things to avoid duplicate lookups of the
PartitionDesc rather than to install a new subsystem that tries to
make sure they always return the same answer.
Such an approach seems to have other possible advantages. For
example, if a COPY is running and a new partition shows up, we might
actually want to allow tuples to be routed to it. Maybe that's too
pie in the sky, but if we want to preserve the option to do such
things in the future, a hard-and-fast rule that the apparent partition
descriptor doesn't change unless the snapshot changes seems like it
might get in the way. It seems better to me to have a system where
code that accesses the relcache has a choice, so that at its option it
can either hang on to the PartitionDesc it has or get a new one that
may be different. If we can do things that way, it gives us the most
flexibility.
After the poking around I've done over the last 24 hours, I do see
that there are some non-trivial problems with making it that way, but
I'm not really ready to give up yet.
Does that make sense to you, or am I all wet here?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2018-11-07 18:41:25 | Re: partitioned indexes and tablespaces |
Previous Message | Dmitry Dolgov | 2018-11-07 18:36:15 | Re: [HACKERS] [PATCH] Generic type subscripting |