Re: Partial index creation always scans the entire table

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: MingJu Wu <mingjuwu0505(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Partial index creation always scans the entire table
Date: 2020-02-16 15:30:05
Message-ID: 19549.1581867005@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> writes:
> On Sat, 2020-02-15 at 19:04 +0800, MingJu Wu wrote:
>> My scenario is that I have a table with 50M rows that are categorized into 10K categories.
>> I need to create a partial index for each category.

> You don't need a partial index per category, you need a single index that *contains* the category.

Yeah, that's an anti-pattern. Essentially, you are trying to replace the
first branching level of an index that includes the category column with
a ton of system catalog entries and planner proof logic to select one of
N indexes that don't include the category. It is *highly* unlikely that
that's going to be a win. It's going to be a huge loss if the planner
fails to make the proof you need, and even when it does, it's not really
going to be faster overall --- you've traded off run-time for planning
time, at a rather unfavorable exchange rate. Updates on the table are
going to be enormously penalized, too, because the index machinery doesn't
have any way to understand that only one of the indexes needs work.

I've seen people try to do this before. I wonder if the manual page
about partial indexes should explicitly say "don't do that".

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Justin Pryzby 2020-02-16 15:59:19 Re: Partial index creation always scans the entire table
Previous Message Laurenz Albe 2020-02-15 21:15:53 Re: Partial index creation always scans the entire table