From: | Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: New VACUUM FULL |
Date: | 2010-01-04 02:50:56 |
Message-ID: | 20100104115056.98C2.52131E4D@oss.ntt.co.jp |
Lists: | pgsql-hackers |
Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> So, what is the roadmap for getting this done? It seems like to get
> rid of VFI completely, we would need to implement something like what
> Tom described here:
>
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg00249.php
>
> I'm not sure whether the current patch is a good intermediate step
> towards that ultimate goal, or whether events have overtaken it.
I think the most desirable roadmap is:
1. Enable CLUSTER on non-critical system catalogs.
2. Also enable CLUSTER and REINDEX on critical system catalogs.
3. Remove VFI and re-implement VACUUM FULL with a CLUSTER-based approach.
It should also be optimized as Simon suggested.
My patch was intended to do 3, but we should not skip 1 and 2. With this roadmap,
we never have two versions of VACUUM FULL (INPLACE and REWRITE) at the same time.
I think we can do 1 immediately. The comment in cluster.c says "might work",
and I think so too. CLUSTERable toast tables are obviously useful.
/*
 * Disallow clustering system relations. This will definitely NOT work
 * for shared relations (we have no way to update pg_class rows in other
 * databases), nor for nailed-in-cache relations (the relfilenode values
 * for those are hardwired, see relcache.c). It might work for other
 * system relations, but I ain't gonna risk it.
 */
For 2, we need some kind of "relfilenode mapper" for shared relations
and critical local tables (pg_class, pg_attribute, pg_proc, and pg_type).
I'm thinking that we store only "virtual" relfilenodes for them in pg_class
and remember the actual relfilenodes in shared memory. For example,
smgropen(1248:pg_database) is redirected to smgropen(mapper[1248]).
Since we cannot touch pg_class in databases other than the one we are
connected to, we need to avoid updating pg_class when we assign new
relfilenodes for shared relations.
We also need to store the nodes in an additional flat file. Another
approach might be to store them in the control file for shared relations
(ControlFileData.shared_relfilenode_mapper as Oid[]), or in pg_database
for local tables (pg_database.datclassnode, datprocnode, etc.).
Which approach would be better?
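To make the mapper idea concrete, here is a rough, standalone sketch of the redirection described above. It is not PostgreSQL code: the type and function names (RelFileNodeMapper, mapper_lookup, mapper_set) and the fixed capacity are hypothetical, the Oid typedef is a stand-in, and a plain struct takes the place of shared memory and of whatever file would persist the mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;           /* stand-in for PostgreSQL's Oid type */

#define MAX_MAPPED_RELS 32      /* hypothetical capacity, chosen arbitrarily */

/* One entry: "virtual" relfilenode kept in pg_class -> actual relfilenode. */
typedef struct RelMapEntry
{
    Oid         virtual_node;
    Oid         actual_node;
} RelMapEntry;

/* Hypothetical mapper; in the proposal this would live in shared memory. */
typedef struct RelFileNodeMapper
{
    size_t      nentries;
    RelMapEntry entries[MAX_MAPPED_RELS];
} RelFileNodeMapper;

/*
 * Resolve a virtual relfilenode to the actual one. Relations that have no
 * entry are not mapped, so their relfilenode is returned unchanged.
 */
static Oid
mapper_lookup(const RelFileNodeMapper *map, Oid virtual_node)
{
    assert(map != NULL);
    for (size_t i = 0; i < map->nentries; i++)
    {
        if (map->entries[i].virtual_node == virtual_node)
            return map->entries[i].actual_node;
    }
    return virtual_node;
}

/*
 * Assign a new actual relfilenode without touching pg_class: only the
 * mapper entry changes. Returns 0 on success, -1 if the mapper is full.
 */
static int
mapper_set(RelFileNodeMapper *map, Oid virtual_node, Oid actual_node)
{
    assert(map != NULL);
    for (size_t i = 0; i < map->nentries; i++)
    {
        if (map->entries[i].virtual_node == virtual_node)
        {
            map->entries[i].actual_node = actual_node;
            return 0;
        }
    }
    if (map->nentries >= MAX_MAPPED_RELS)
        return -1;
    map->entries[map->nentries].virtual_node = virtual_node;
    map->entries[map->nentries].actual_node = actual_node;
    map->nentries++;
    return 0;
}
```

In this sketch a CLUSTER of a mapped relation would call mapper_set() with the new file's node, and a caller in the position of smgropen() would pass its relfilenode through mapper_lookup() first. Where the mapping is persisted (flat file, control file, or pg_database) is left open here, as in the question above.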
Regards,
---
Takahiro Itagaki
NTT Open Source Software Center