Re: Query planner question

From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Ernest E Vogelsinger <ernest(at)vogelsinger(dot)at>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Query planner question
Date: 2003-06-12 21:56:44
Message-ID: 20030612145211.Y37555-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On Thu, 12 Jun 2003, Ernest E Vogelsinger wrote:

>
> Maybe someone can shed a light on this:
>
> I have a (test) table, populated with 2M rows:
> rid int4, -- primary key
> dcid varchar,
> dsid varchar,
> drid int4,
> owid int4,
> nxid int4,
> usg int4,
> --- something more that's not important
>
> Indexes in question:
> CREATE INDEX id_owid ON table (owid,nxid,usg) USING BTREE
> CREATE INDEX id_dowid ON table (owid, dcid, dsid, drid, nxid, usg)
> USING BTREE
> CREATE INDEX id_dictid ON table (dcid, dsid, drid, nxid) USING BTREE
>
> Test Run
>
> VACUUM ANALYZE FULL;
> VACUUM
> EXPLAIN ANALYZE SELECT DISTINCT dcid,dsid,drid FROM table WHERE owid = 1;
> NOTICE: QUERY PLAN:
> Unique (cost=402.07..402.90 rows=11 width=20) (actual time=10.13..11.49
> rows=512 loops=1)
> -> Sort (cost=402.07..402.07 rows=111 width=20) (actual
> time=10.13..10.46 rows=512 loops=1)
> -> Index Scan using id_owid on table (cost=0.00..398.30 rows=111
> width=20) (actual time=0.05..4.44 rows=512 loops=1)
> Total runtime: 11.95 msec
>
> EXPLAIN ANALYZE SELECT DISTINCT dcid,dsid,drid FROM table WHERE owid is null;
> NOTICE: QUERY PLAN:
> Unique (cost=126611.73..128034.59 rows=18971 width=20) (actual
> time=27515.63..28359.88 rows=513 loops=1)
> -> Sort (cost=126611.73..126611.73 rows=189714 width=20) (actual
> time=27515.62..27792.29 rows=199131 loops=1)
> -> Seq Scan on table (cost=0.00..106885.04 rows=189714
> width=20) (actual time=18.76..16467.28 rows=199131 loops=1)
> Total runtime: 28633.68 msec
>
> SET enable_seqscan TO off;
> SET VARIABLE
> EXPLAIN SELECT DISTINCT dcid,dsid,drid FROM table WHERE owid is null;
> -- this process uses 34M RSS!!!
> NOTICE: QUERY PLAN:
> Unique (cost=0.00..7887659.31 rows=18971 width=20) (actual
> time=2.57..711940.78 rows=513 loops=1)
> -> Index Scan using id_dictid on table (cost=0.00..7886236.46
> rows=189714 width=20) (actual time=2.57..710482.07 rows=199131 loops=1)
> Total runtime: 711942.76 msec
>
> A couple of questions arise:
>
> 1) Why chooses the planner to use id_owid (and not id_dowid as it would
> seem logical) for the first query?

I think that it doesn't entirely know that owid=1, sort by dcid, dsid,
drid can be handled by that index. I think it's possible that if you
added owid to the select list it might use id_dowid instead. I think this
is similar to the issues with order by, conditions and index choice, which
you may find useful information in the archives about)

> 2) Why is NO index used for the second query, the only difference being in
> the constraint value (owid is set vs. owid is null)?

IS NULL is not considered an indexable condition currently (there are past
discussions and hackarounds in the archives)

> 3) Why does it use id_dictid for the second query when forced to, and not
> id_owid or id_dowid?

As for #2, it doesn't think it can use an index with owid in the front.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dmitry Tkach 2003-06-12 21:57:06 Re: Query planner question
Previous Message Jim C. Nasby 2003-06-12 21:52:50 Re: Postgres performance comments from a MySQL user