Re: Creating Index

From: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
To: CN <cnliou9(at)fastmail(dot)fm>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Creating Index
Date: 2003-10-02 05:37:10
Message-ID: 20031001222050.E53419@megazone.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Wed, 1 Oct 2003, CN wrote:

> > You do realize that extract returns a double precision value not an
> > integer, and it's probably not going to be willing to push clauses down
> > through the union where the types are different .
> >
>
> Argh! I didn't noticed that. Thanks for the reminder.
>
> Let's do not consider table2 and view1 for this moment and focus only on
> table1.
> Table1 in my original post was incorrect. Please forgive me! (I posted it
> midnight when my head was not clear and tried to make my case simple for
> understanding.) The correct one is:
>
> CREATE TABLE table1
> ( id VARCHAR(20) PRIMARY KEY,
> d DATE,
> amount INTEGER
> );
> CREATE INDEX itable1 ON table1 (d);
>
> EXPLAIN ANALYZE SELECT COUNT(*) FROM table1 WHERE EXTRACT(YEAR FROM d) >=
> 2001.0 AND EXTRACT(MONTH FROM d) >= 1.;
>
> takes 630 msec on my AMD 450MHz machine. While
>
> EXPLAIN ANALYZE SELECT COUNT(*) FROM table1 WHERE d >= '2001-1-1';

These two queries seem fairly equivalent, but
WHERE EXTRACT(YEAR FROM d) >=2001 AND EXTRACT(MONTH FROM d)>=2
is not equivalent to
WHERE d>='2001-2-1'

Are you trying to get certain months in a group of years or all months
after a given fixed time point? If the former, only the former form in
general will work, if the latter the former form really doesn't work at
all with the exception of the case where you're doing month>=1 (which
might as well mean you don't do a month test at all).

> Aggregate (cost=535.20..535.20 rows=1 width=0) (actual
> time=625.10..625.11 rows=1 loops=1)
> -> Seq Scan on table1 (cost=0.00..532.58 rows=1048 width=0) (actual
> time=14.84..605.85 rows=3603 loops=1)
> Filter: ((date_part('year'::text, f2) > 2001::double precision)
> AND (date_part('month'::text, f2) >=
> 1::double precision))
> Total runtime: 626.61 msec
>
> -----------------------
> Aggregate (cost=464.12..464.12 rows=1 width=0) (actual
> time=114.28..114.28 rows=1 loops=1)
> -> Seq Scan on table1 (cost=0.00..461.86 rows=902 width=0) (actual
> time=10.71..102.99 rows=3603 loops=1)
> Filter: (f2 >= '2002-01-01'::date)
> Total runtime: 114.50 msec
>
> Does the first query perform sequential scan?

They both are. I'd have to guess that most of the real cost is coming from
evaluating the conditions, which seems wierd.

> If a composit index (year,month) derived from column "d" helps and is
> available, then someone please show me how to build that index like:
>
> CREATE INDEX i1 ON table1 <EXTRACT(YEAR FROM d)::TEXT || EXTRACT(MONTH
> FROM d)::TEXT>
>
> Is creating a function that eats DATE as argument to build that index my
> only solution?

For 7.3 and earlier, yes I think so and you'd have to use that form in
the query. However that wouldn't help for the union query.

In 7.4, you can make an index on table((extract(year from d)),
(extract(month from d))) and I believe once the type issues were resolved
that would get used. However, I think the evaluations of the extracts
would mean that it'd still probably lose to a comparison on date (unless
you want the fraction of a set of years solution).

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Popeanga Marian 2003-10-02 08:52:26 output
Previous Message Tom Lane 2003-10-02 03:18:07 Re: Creating Index