Re: partitioned indexes and tablespaces

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: partitioned indexes and tablespaces
Date: 2018-11-02 02:46:59
Message-ID: 598f468e-39e1-250f-7584-7fc41b19ab3b@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2018/11/02 10:27, Michael Paquier wrote:
> On Thu, Nov 01, 2018 at 09:31:38PM -0300, Alvaro Herrera wrote:
>> A customer reported to us that partitioned indexes are not working
>> consistently with tablespaces:
>
> Let's see...
>
>> 1. When a CREATE INDEX specifies a tablespace, existing partitions get
>> the index in the correct tablespace; however, the parent index itself
>> does not record the tablespace. So when new partitions are created
>> later, they get the index in the default tablespace instead of the
>> specified tablespace. Fix by saving the tablespace in the pg_class row
>> for the parent index.
>
> I may be missing something of course... But partitioned tables don't
> register the tablespace they are on either so as it cannot be used by
> any partitions created on it:
> =# create tablespace popo location '/home/ioltas/data/tbspace';
> CREATE TABLESPACE
> =# create table aa (a int) partition by list (a) tablespace popo;
> CREATE TABLE
> =# create table aa_1 partition of aa for values in (1) tablespace popo;
> CREATE TABLE
> =# create table aa_2 partition of aa for values in (2);
> CREATE TABLE
> =# select t.spcname, c.relname from pg_class c, pg_tablespace t
> where c.oid > 16000 and c.reltablespace = t.oid;
> spcname | relname
> ---------+---------
> popo | aa_1
> (1 row)
>
> It seems to me that the current behavior is wanted in this case, because
> partitioned tables and partitioned indexes have no physical storage.

Keith Fiske complained about this behavior for partitioned *tables* a few
months ago, which led to the following discussion:

https://www.postgresql.org/message-id/flat/CAKJS1f9PXYcT%2Bj%3DoyL-Lquz%3DScNwpRtmD7u9svLASUygBdbN8w%40mail.gmail.com

It's Michael's message that was the last one on that thread. :)

https://www.postgresql.org/message-id/20180413224007.GB27295%40paquier.xyz

By the way, if we decide to do something about this, I think we do the
same for partitioned tables. There are more than one interesting
behaviors possible that are mentioned in the above thread for when
parent's reltablespace is set/changed. For example, when it's set, the
existing partitions are not moved, but the new value only applies to
subsequently created partitions. Considering various such behaviors, this
would seem like new feature work, which I'd suppose would only be
considered for 12.

>> 2. ALTER TABLE SET TABLESPACE, applied to the partitioned index, would
>> raise an error indicating that it's not the correct relation kind. In
>> order for this to actually work, we need bespoke code for ATExecCmd();
>> the code for all other relation kinds wants to move storage (and runs in
>> Phase 3, later), but these indexes do not have that. Therefore, write a
>> cut-down version which is invoked directly in ATExecCmd instead.
>
> Using the previous example, this does not raise an error:
> alter table aa set tablespace popo;
> However the reference to reltablespace in pg_class is not changed. So I
> would agree with your point to not raise an error and back-patch that,
> but I don't agree with the point of changing reltablespace for a
> partitioned index if that's what you mean.
>
>> 3. ALTER INDEX ALL IN TABLESPACE, identical problem, is also fixed by
>> the above change.
>
> Reproducible with just the following stuff on top of the previous
> example:
> create index aai on aa(a);
> alter index all in tablespace pg_default set tablespace popo;
>
> In this case also raising an error is a bug, it seems to me that
> partitioned indexes should just be ignored.

Same argument here too.

IOW, I agree with Michael that if something will be back-patched to 11, it
should be a small patch to make the unsupported relkind error go away.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christian Ohler 2018-11-02 02:48:59 Re: WIP Patch: Add a function that returns binary JSONB as a bytea
Previous Message Steve Singer 2018-11-02 02:43:00 Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)