From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Thoughts on nbtree with logical/varwidth table identifiers, v12 on-disk representation |
Date: | 2019-04-24 12:22:04 |
Message-ID: | CA+Tgmobm8GEvh+R0HX9xQWNk5ULczkvrMVDPw6cV_RdqREu1mg@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Apr 22, 2019 at 1:16 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> Yes, though that should probably work by reusing what we already do
> with heap TID (use standard IndexTuple fields on the leaf level for
> heap TID), plus an additional identifier for the partition number that
> is located at the physical end of the tuple. IOW, I think that this
> might benefit from a design that is half way between what we already
> do with heap TIDs and what we would be required to do to make varwidth
> logical row identifiers in tables work -- the partition number is
> varwidth, though often only a single byte.
I think we're likely to have a problem with global indexes + DETACH
PARTITION that is similar to the problem we now have with DROP COLUMN.
If you drop or detach a partition, you can either (a) perform, as part
of that operation, a scan of every global index to remove all
references to the former partition, or (b) tell each global index
that all references to that partition number ought to be regarded as
dead index tuples. (b) makes detaching partitions faster and (a)
seems hard to make rollback-safe, so I'm guessing we'll end up with
(b).
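
To make option (b) concrete, here is a toy sketch, not a proposal for the actual nbtree code. The class GlobalIndex, its fields, and the (key, partition number, TID) entry layout are all hypothetical names invented for illustration. Detaching only records the partition number; scans skip matching entries, and a later vacuum-like pass physically removes them.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalIndex:
    """Toy model of a global index (hypothetical, for illustration only)."""
    # Each entry is (key, partition_number, heap_tid).
    entries: list = field(default_factory=list)
    # Partition numbers whose index tuples must be treated as dead.
    detached: set = field(default_factory=set)

    def insert(self, key, partno, tid):
        self.entries.append((key, partno, tid))

    def detach_partition(self, partno):
        # Option (b): no index scan at detach time, just remember the number.
        self.detached.add(partno)

    def scan(self, key):
        # Entries pointing at a detached partition are regarded as dead.
        return [(p, t) for (k, p, t) in self.entries
                if k == key and p not in self.detached]

    def vacuum(self):
        # Deferred cleanup: physically drop the dead entries, return the count.
        before = len(self.entries)
        self.entries = [e for e in self.entries if e[1] not in self.detached]
        return before - len(self.entries)
```

Note that in this sketch a detached partition number can never be handed out again while dead entries for it may still exist, which is why the numbers keep growing under repeated attach/detach.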
But that means that if someone repeatedly attaches and detaches
partitions, the partition numbers could get quite big. And even
without that, somebody could have a lot of partitions. So while I do
not disagree that the partition number could be variable-width and
sometimes only 1 payload byte, I think we had better make sure to
design the system in such a way that it scales to at least 4 payload
bytes, because I have no faith that anything less will be sufficient
for our demanding user base.
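
As an illustration of a width that starts at one byte and scales to a full 32-bit partition number, here is a minimal sketch using a LEB128-style encoding (7 value bits per byte, high bit set when more bytes follow). This particular encoding is my assumption and is not something the thread settles on; the function names encode_partno and decode_partno are hypothetical. It also ignores the question of reading the value from the physical end of the tuple.

```python
MAX_PARTNO = 0xFFFFFFFF  # assumed upper bound: a 32-bit partition number


def encode_partno(partno: int) -> bytes:
    """Encode a partition number in 1 to 5 bytes, low-order bits first."""
    if not 0 <= partno <= MAX_PARTNO:
        raise ValueError("partition number out of range")
    out = bytearray()
    while True:
        byte = partno & 0x7F
        partno >>= 7
        if partno:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)         # final byte has the high bit clear
            return bytes(out)


def decode_partno(buf: bytes) -> int:
    """Decode a value produced by encode_partno from the start of buf."""
    value = 0
    shift = 0
    for byte in buf:
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value
        shift += 7
    raise ValueError("truncated partition number")
```

With this scheme, partition numbers below 128 take a single byte, and the largest 32-bit value takes five bytes because each byte carries only seven value bits.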
We don't want people to be able to exhaust the supply of partition
numbers the way they can exhaust the supply of attribute numbers by
adding and dropping columns repeatedly.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company