From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Alban Hertroys <haramrae(at)gmail(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Performance with high correlation in group by on PK |
Date: | 2017-08-28 19:32:19 |
Message-ID: | CAMkU=1z_RZVMvXpag3NuA+mZS+1BPLjrZTrODAwvLMxit41E6w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Mon, Aug 28, 2017 at 5:22 AM, Alban Hertroys <haramrae(at)gmail(dot)com> wrote:
> Hi all,
>
> It's been a while since I actually got to use PG for anything serious,
> but we're finally doing some experimentation @work now to see if it is
> suitable for our datawarehouse. So far it's been doing well, but there
> is a particular type of query I run into that I expect we will
> frequently use and that's choosing a sequential scan - and I can't
> fathom why.
>
> This is on:
>
>
> The query in question is:
> select "VBAK_MANDT", max("VBAK_VBELN")
> from staging.etl00001_vbak
> group by "VBAK_MANDT";
>
> This is the header-table for another detail table, and in this case
> we're already seeing a seqscan. The thing is, there are 15M rows in
> the table (disk usage is 15GB), while the PK is on ("VBAK_MANDT",
> "VBAK_VBELN") with very few distinct values for "VBAK_MANDT" (in fact,
> we only have 1 at the moment!).
>
You need an "index skip-scan" or "loose index scan". PostgreSQL doesn't
currently detect and implement them automatically, but you can use a
recursive CTE to get it to work. There are some examples at
https://wiki.postgresql.org/wiki/Loose_indexscan
Cheers,
Jeff
From | Date | Subject | |
---|---|---|---|
Next Message | Jerry Regan | 2017-08-28 20:28:19 | Create Action for psql when NOTIFY Recieved |
Previous Message | Gabriel Furstenheim Milerud | 2017-08-28 19:04:10 | Re: Extension coverage |