Re: : PostgreSQL Index behavior

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Venkat Balaji <venkat(dot)balaji(at)verse(dot)in>
Cc: PGSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: : PostgreSQL Index behavior
Date: 2012-09-10 15:36:16
Message-ID: CAMkU=1wziuZJMD8Aat6+rwMtWdg25qCb75Len6LaFrSJ-SKnqg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Sep 10, 2012 at 5:39 AM, Venkat Balaji <venkat(dot)balaji(at)verse(dot)in> wrote:
> Hello Community,
>
> I intend to understand further on PostgreSQL Index behavior on a "SELECT"
> statement.
>
> We have a situation where-in Index on unique column is not being picked up
> as expected when used with-in the WHERE clause with other non-unique columns
> using AND operator.
>
> explain SELECT tv.short_code, tv.chn as pkg_subscription_chn,
> tv.vert as pkg_vert, ubs.campaign_id as campaign,
> 'none'::varchar as referer,
> CAST('CAMPAIGNWISE_SUBSCRIBER_BASE' AS VARCHAR) as vn,
> count(tv.msisdn) as n_count, '0'::numeric AS tot_revenue
> FROM campaign_base ubs
> JOIN tab_current_day_v2 tv
> ON ubs.ubs_seq_id = tv.ubs_seq_id
> AND tv.dt = CAST('2012-09-08' AS DATE)
> GROUP BY tv.short_code, tv.vert, tv.chn, ubs.campaign_id, vn;
...
>
> The above plan shows "seq scan" on tab_current_day_v2 table, though there is
> an index on "ubs_seq_id" column which is an unique column.
>
> Can anyone please help us understand, why PostgreSQL optimizer is not
> prioritizing the unique column and hitting ubs_seq_id_idx Index here ?

The query where clause does not specify a constant value for
ubs_seq_id. So it is likely that the only way to use that index would
be to reverse the order of the nested loop and seq scan the other
table. Is there any reason to think that doing that would be faster?

>
> Later -
>
> We have created composite Index on "dt" (one distinct value) and
> "ubs_seq_id" (no duplicate values) and the index has been picked up.

Postgres seems to think that "dt" has no duplicate values, the
opposite of having one distinct value.
That is based on the estimates given in the explain plan, that teh seq
scan will return only one row after the filter on Filter: "(dt =
'2012-09-08'::date)". This does seem to conflict with what you
report from pg_stats, but I'm not familiar with that view, and you
haven't told us what version of pgsql you are using.

> Below is the scenario where-in the same query's plan picking up the
> composite Index.

It is only using the first column of that composite index. So if you
built a single column index just on dt, it would be picked up as well.

Cheers,

Jeff

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jesper Krogh 2012-09-10 18:18:38 Re: Planner selects different execution plans depending on limit
Previous Message Willy-Bas Loos 2012-09-10 14:40:28 force defaults