From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, "Alex V(dot)" <in_flight(at)pclovers(dot)ru>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>, tgl <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: Table partition with primary key in 11.3 |
Date: | 2019-06-07 21:35:32 |
Message-ID: | 20190607213532.GA26907@alvherre.pgsql |
Lists: | pgsql-general |
On 2019-Jun-07, Peter Geoghegan wrote:
> On Fri, Jun 7, 2019 at 1:22 PM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> > Because you can't rely on that exclusively, and you want to reuse the
> > partition ID eventually, you still need a cleanup process that removes
> > those remaining index entries. This cleanup process is a background
> > process, so it doesn't affect latency. I think it's not a good idea to
> > add latency to clients in order to optimize a background process.
>
> Ordinarily I would agree, but we're talking about something that takes
> place at the point that we're just about to split the page, that will
> probably make the page split unnecessary when we can reclaim as few as
> one or two tuples. A page split is already a very expensive thing by
> any measure, and something that can rarely be "undone", so avoiding
> them entirely is very compelling.
Sorry, I confused your argument with mine. I agree that removing
entries to try to prevent a page split is worth doing.
> > This way, when a partition is dropped, we have to take the time to scan
> > all global indexes; when they've been scanned we can remove the catalog
> > entry, and at that point the partition ID becomes available to future
> > partitions.
>
> It seems worth recycling partition IDs, but it should be possible to
> delay that for a very long time if necessary. Ideally, users wouldn't
> have to bother with it when they have really huge global indexes.
I envision this happening automatically -- you drop the partition, a
persistent work item is registered, autovacuum takes care of it
whenever. The user doesn't have to do anything about it.
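To make the flow concrete, here is a toy model of that lifecycle. It is only a sketch under assumptions: the names `GlobalIndex`, `PartitionCatalog` and `background_cleanup` are hypothetical, nothing here corresponds to existing PostgreSQL code, and the work queue lives in memory rather than being persistent. The point it illustrates is that dropping a partition returns immediately, and the partition ID only becomes reusable once every global index has been scanned.

```python
from collections import deque


class GlobalIndex:
    """Toy global index: a flat list of (key, partition_id) entries."""

    def __init__(self):
        self.entries = []

    def insert(self, key, partition_id):
        self.entries.append((key, partition_id))

    def remove_partition(self, partition_id):
        # Full scan that discards every entry pointing at the dropped partition.
        self.entries = [e for e in self.entries if e[1] != partition_id]


class PartitionCatalog:
    """Hypothetical catalog that hands out and recycles partition IDs."""

    def __init__(self, indexes):
        self.indexes = indexes
        self.live = set()
        self.pending = deque()  # work items: dropped IDs awaiting cleanup
        self.free = []          # IDs that are safe to hand out again
        self.next_id = 0

    def create_partition(self):
        if self.free:
            pid = self.free.pop()
        else:
            pid = self.next_id
            self.next_id += 1
        self.live.add(pid)
        return pid

    def drop_partition(self, pid):
        # Only registers a work item; the client pays no index-scan cost here.
        self.live.remove(pid)
        self.pending.append(pid)

    def background_cleanup(self):
        # Stand-in for what an autovacuum-like worker would do later.
        while self.pending:
            pid = self.pending[0]
            for index in self.indexes:
                index.remove_partition(pid)
            # Release the ID only after all global indexes have been scanned.
            self.pending.popleft()
            self.free.append(pid)
```

Until `background_cleanup()` has run, a dropped ID stays out of the free list, so a new partition gets a fresh ID and can never be confused with stale index entries.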
> You don't have to be Claude Shannon to realize that it's kind of silly
> to reserve 16 bits for the offset number component of a
> TID/ItemPointer. We need to continue to support offset numbers that go
> that high, but the implementation would optimize for the common case
> where offset numbers are less than 512 (or maybe less than 1024).
(In practice, offset numbers on typical pages often fit in fewer than
7 bits, even.)
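The arithmetic behind those bit counts can be checked with a short sketch. The helper names are hypothetical, and the default sizes are assumptions about a stock build (8 kB block, 24-byte page header, 24-byte aligned heap tuple header, 4-byte line pointer); other builds would give different numbers.

```python
def bits_needed(offset_number: int) -> int:
    """Number of bits required to store a 1-based offset number."""
    if offset_number < 1:
        raise ValueError("offset numbers start at 1")
    return offset_number.bit_length()


def max_heap_tuples_per_page(block_size: int = 8192,
                             page_header: int = 24,
                             tuple_header: int = 24,
                             line_pointer: int = 4) -> int:
    """Upper bound on heap tuples per page, assuming zero-length tuple data.

    All default sizes are assumptions about a default build.
    """
    return (block_size - page_header) // (tuple_header + line_pointer)
```

Under those assumptions a heap page can never hold more than a few hundred tuples, so offsets below 512 need at most 9 bits, and a page with 63 or fewer line pointers needs at most 6, well short of the 16 bits reserved today.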
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services