Re: Proposal: Global Index

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Jose Luis Tallon <jltallon(at)adv-solutions(dot)net>
Cc: Jeremy Schneider <schnjere(at)amazon(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Hamid Akhtar <hamid(dot)akhtar(at)gmail(dot)com>, "heikki(dot)linnakangas" <heikki(dot)linnakangas(at)iki(dot)fi>
Subject: Re: Proposal: Global Index
Date: 2019-12-19 16:12:07
Message-ID: 20191219161207.GM30116@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 19, 2019 at 09:48:40AM +0100, Jose Luis Tallon wrote:
> On 19/12/19 4:03, Bruce Momjian wrote:
> > On Mon, Nov 25, 2019 at 03:44:39PM -0800, Jeremy Schneider wrote:
> > > On 11/25/19 15:05, Jeremy Schneider wrote:
> > > > ... the cost of doing the individual index lookups across 180
> > > > partitions (and 180 indexes) was very high, so they stored max and min
> > > > txn id per partition and would generate a query with all the dates that
> > > > a txn id could have been in so that only a small number of partition
> > > > indexes would be accessed.
> > > >
> > > > .. If we are looking for higher concurrency, we can usually
> > > > add a hack/workaround that filters on a partition key to provide “pretty
> > > > good” pruning. The net result is that you get 2-3x the IO due to the
> > > > lack of global index (same workaround as first story above).
> > > Is that basically like a global BRIN index with granularity at the
> > > partition level?
> > Exactly! :-)
>
> Actually, one "kind of" BRIN index *per partitioned table* mapping (key
> range) -> (partition oid)... and so concurrency doesn't need to be very
> affected.
>
> (we don't need to do things just like other RDBMS do, ya know... ;)
>
>
> IIRC, this precise approach was suggested around 2016 when initially
> discussing the "declarative partitioning" which originated Postgres' current
> partitioning scheme, in order to optimize partition pruning.

Robert Haas identified two needs for global indexes:

https://www.postgresql.org/message-id/CA+Tgmob_J2M2+QKWrhg2NjQEkMEwZNTfd7a6Ubg34fJuZPkN2g@mail.gmail.com

One of the biggest reasons why people want it is to enforce uniqueness
for secondary keys - e.g. the employees table is partitioned by
employee ID, but SSN should also be unique, at least among employees
for whom it's not NULL.

But people also want it for faster data retrieval: if you're looking
for a commonly-occurring value, an index per partition is fine. But if
you're looking for values that occur only once or a few times across
the whole hierarchy, an index scan per partition is very costly.

I don't see lossy BRIN indexes helping with the uniqueness use-case, and
I am not sure they would help with the rare case either. They would
help for range-based partitions, but I thought our existing facilities
worked in that case.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2019-12-19 16:19:59 Re: [PATCH] Remove twice assignment with var pageop (nbtree.c).
Previous Message Andres Freund 2019-12-19 16:02:48 Re: Read Uncommitted