From: | 曾文旌 <wenjing(dot)zwj(at)alibaba-inc(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Hamid Akhtar <hamid(dot)akhtar(at)gmail(dot)com>, "heikki(dot)linnakangas" <heikki(dot)linnakangas(at)iki(dot)fi>, "Nasby, Jim" <nasbyj(at)amazon(dot)com>, wjzeng <wjzeng2012(at)gmail(dot)com> |
Subject: | Re: Proposal: Global Index |
Date: | 2021-01-18 10:00:40 |
Message-ID: | 577C04B3-B747-4C4A-A16D-159075FF1FD9@alibaba-inc.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> 2021年1月12日 02:37,Robert Haas <robertmhaas(at)gmail(dot)com> 写道:
>
> On Mon, Jan 11, 2021 at 12:46 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>>> For 1) The DETACH old child table can be finished immediately, global index can be kept valid after DETACH is completed, and the cleanup of garbage data in global index can be deferred to VACUUM.
>>> This is similar to the global index optimization done by Oracle12c.
>>> For 2) ATTACH new empty child table can also be completed immediately.
>>> If this is the case, many of the advantages of partitioned tables will be retained, while the advantages of global indexes will be gained.
>>
>> Yes, we can keep the index rows for the deleted partition and clean them
>> up later, but what is the advantage of partitioning then? Just heap
>> deletion quickly? Is that enough of a value?
>
> I actually think the idea of lazily deleting the index entries is
> pretty good, but it won't work if the way the global index is
> implemented is by adding a tableoid column. Because then, I might
> detach a partition and later reattach it and the old index entries are
> still there but the table contents might have changed. Worse yet, the
> table might be dropped and the table OID reused for a completely
> unrelated table with completely unrelated contents, which could then
> be attached as a new partition.
>
> One of the big selling points of global indexes is that they allow you
> to enforce uniqueness on a column unrelated to the partitioning
> column. Another is that you can look up a value by doing a single
> index scan on the global index rather than an index scan per
> partition. Those things are no less valuable for performing index
> deletion lazily.
>
> However, there is a VACUUM amplification effect to worry about here
> which Wenjing seems not to be considering. Suppose I have a table
> which is not partitioned and it is 1TB in size with an index that is
> 128GB in size. To vacuum the table, I need to do 1TB + 128GB of I/O.
> Now, suppose I now partition the table into 1024 partitions each with
> its own local index. Each partition is 1GB in size and the index on
> each partition is 128MB in size. To vacuum an individual partition
> requires 1GB + 128MB of I/O, so to vacuum all the partitions requires
> the same amount of total I/O as before. But, now suppose that I have a
> single global index instead of a local index per partition. First, how
> big will that index be? It will not be 128GB, but somewhat bigger,
> because it needs extra space for every indexed tuple. Let's say 140GB.
> Furthermore, it will need to be vacuumed whenever any child is
> vacuumed, because it contains some index entries from every child. So
> the total I/O to vacuum all partitions is now 1GB * 1024 + 140GB *
> 1024 = 141TB, which is a heck of a lot worse than the 1.125TB I
> required with the unpartitioned table or the locally partitioned
> table.
Thank you for pointing this out.
It seems that some optimization can be done, but there is no good way
to completely eliminate the vacuum amplification effect of the global index.
Maybe we can only count on Zheap, which doesn't need to do Vaccum.
>
> That's not necessarily a death sentence for every use case, but it's
> going to be pretty bad for tables that are big and heavily updated.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Nancarrow | 2021-01-18 10:19:49 | Re: Parallel INSERT (INTO ... SELECT ...) |
Previous Message | Pavel Stehule | 2021-01-18 09:59:27 | Re: proposal: schema variables |