| From: | Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru> | 
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> | 
| Cc: | Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Steve Singer <steve(at)ssinger(dot)info>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Jose Luis Tallon <jltallon(at)adv-solutions(dot)net> | 
| Subject: | Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly | 
| Date: | 2019-12-02 09:41:03 | 
| Message-ID: | e625699b-bb82-a28b-370b-2da4bbdb72ed@postgrespro.ru | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On 02.12.2019 11:21, Michael Paquier wrote:
> On Wed, Nov 27, 2019 at 08:47:06PM +0300, Alexey Kondratov wrote:
>> The only difference is that point 3) and tablespace part of 5) were missing
>> in RelationSetNewRelfilenode, so I added them, and I do 4) after 6) in
>> REINDEX. Thus, it seems that in my implementation of tablespace change in
>> REINDEX I am more sure that "the relation tablespace is correctly updated
>> before reindexing", since I do reindex after CCI (point 6), doesn't it?
>>
>> So why it is fine for ATExecSetTableSpace to do pretty much the same, but
>> not for REINDEX? Or the key point is in doing actual work before CCI, but
>> for me it seems a bit against what you have wrote?
> Nope, the order is not the same on what you do here, causing a
> duplication in the tablespace selection within
> RelationSetNewRelfilenode() and when flushing the relation on the new
> tablespace for the first time after the CCI happens, please see
> below.  And we should avoid that.
>
>> Thus, I cannot get your point correctly here. Can you, please, elaborate a
>> little bit more your concerns?
> The case of REINDEX CONCURRENTLY is pretty simple, because a new
> relation which is a copy of the old relation is created before doing
> the reindex, so you simply need to set the tablespace OID correctly
> in index_concurrently_create_copy().  And actually, I think that the
> computation is incorrect because we need to check after
> MyDatabaseTableSpace as well, no?
No, the same logic already exists in heap_create:
     if (reltablespace == MyDatabaseTableSpace)
         reltablespace = InvalidOid;
Which is called by index_concurrently_create_copy -> index_create -> 
heap_create.
> The case of REINDEX is more tricky, because you are working on a
> relation that already exists, hence I think that what you need to do a
> different thing before the actual REINDEX:
> 1) Update the existing relation's pg_class tuple to point to the new
> tablespace.
> 2) Do a CommandCounterIncrement.
> So I think that the order of the operations you are doing is incorrect,
> and that you have a risk of breaking the existing tablespace assignment
> logic done when first flushing a new relfilenode.
>
> This actually brings an extra thing: when doing a plain REINDEX you
> need to make sure that the past relfilenode of the relation gets away
> properly.  The attached POC patch does that before doing the CCI which
> is a bit ugly, but that's enough to show my point, and there is no
> need to touch RelationSetNewRelfilenode() this way.
Thank you for the detailed answer and PoC patch. I will recheck 
everything and dig deeper into this problem, and come up with something 
closer to the next 01.2020 commitfest.
Regards
-- 
Alexey Kondratov
Postgres Professional https://www.postgrespro.com
Russian Postgres Company
| From | Date | Subject | |
|---|---|---|---|
| Next Message | yuzuko | 2019-12-02 09:42:22 | Re: Autovacuum on partitioned table | 
| Previous Message | yuzuko | 2019-12-02 09:25:37 | Re: Partitioning versus autovacuum |