Re: Clustered index to preserve data locality in a multitenant application?

From: Igor Neyman <ineyman(at)perceptron(dot)com>
To: Nicolas Grilly <nicolas(at)gardentechno(dot)com>, Kenneth Marshall <ktm(at)rice(dot)edu>
Cc: Vick Khera <vivek(at)khera(dot)org>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Clustered index to preserve data locality in a multitenant application?
Date: 2016-09-01 13:08:24
Message-ID: MWHPR07MB2877984B8E63B20E024FE031DAE20@MWHPR07MB2877.namprd07.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Nicolas Grilly
Sent: Wednesday, August 31, 2016 6:32 PM
To: Kenneth Marshall <ktm(at)rice(dot)edu>
Cc: Vick Khera <vivek(at)khera(dot)org>; pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: [GENERAL] Clustered index to preserve data locality in a multitenant application?

On Tue, Aug 30, 2016 at 8:17 PM, Kenneth Marshall <ktm(at)rice(dot)edu<mailto:ktm(at)rice(dot)edu>> wrote:
We have been using the extension pg_repack to keep a table groomed into
cluster order. With an appropriate FILLFACTOR to keep updates on the same
page, it works well. The issue is that it needs space to rebuild the new
index/table. If you have that, it works well.

In DB2, it seems possible to define a "clustering index" that determines how rows are physically ordered in the "table space" (the heap).

The documentation says: "When a table has a clustering index, an INSERT statement causes DB2 to insert the records as nearly as possible in the order of their index values."

It looks like a kind of "continuous CLUSTER/pg_repack". Is there something similar available or planned for PostgreSQL?

Don’t know about plans to implement clustered indexes in PostgreSQL.

Not sure if this was mentioned, MS SQL Server has clustered indexes, where heap row is just stored on the leaf level of the index.
Oracle also has similar feature: IOT, Index Organized Table.

It seems to me (may be I’m wrong), that in PostgreSQL it should be much harder to implement clustered index (with the heap row stored in the index leaf) because of the way how MVCC implemented: multiple row versions are stored in the table itself (e.g. Oracle for that purpose keeps table “clean” and stores multiple row versions in UNDO tablespace/segment).

Regards,
Igor Neyman

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message rob stone 2016-09-01 13:35:17 Re: COL unique (CustomerID) plus COL unique (COUNT) inside CustomerID
Previous Message Mike Sofen 2016-09-01 12:20:11 Re: UPDATE OR REPLACE?