Re: extract(year from date) doesn't use index but maybe could?

From: Adam Tauno Williams <awilliam(at)whitemice(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: extract(year from date) doesn't use index but maybe could?
Date: 2015-04-19 21:03:47
Message-ID: 1429477427.11066.4.camel@whitemice.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sun, 2015-04-19 at 13:10 -0700, Jon Dufresne wrote:
> On Sun, Apr 19, 2015 at 10:42 AM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> > On 04/19/15 19:16, Jon Dufresne wrote:
> >> Given the table:
> >> CREATE TABLE dates (id SERIAL, d DATE NOT NULL, t TEXT NOT NULL)
> >> With an *index* on field d. The following two queries are functionally
> >> equivalent:
> >> 1. SELECT * FROM dates WHERE d >= '1900-01-01'
> >> 2. SELECT * FROM dates WHERE EXTRACT(year from d) >= 1900'
> >> By functionally equivalent, they will return the same result set.
> >> Query 2 does not use the index, adding a performance cost. It seems
> >> there is an opportunity for optimization to handle these two queries
> >> equivalently to take advantage of the index.
> > Or you might try creating an expression index ..
> > CREATE INDEX date_year_idx ON dates((extract(year from d)));
> Certainly, but won't this add additional overhead in the form of two
> indexes; one for the column and one for the expression?
> My point is, why force the user to take these extra steps or add
> overhead when the the two queries (or two indexes) are functionally
> equivalent.

But they aren't functionally equivalent. One is an index on a
datetime/date, the other is an index just on the year [a DOUBLE].
Date/datetimes potentially have time zones, integer values do not - in
general time values are an order of magnitude more complicated than
people expect.

> Shouldn't this is an optimization handled by the database
> so the user doesn't need to hand optimize these differences?

Sometimes "d >= '1900-01-01'" and "EXTRACT(year from d) >= 1900" may be
equivalent; but not always.

--
Adam Tauno Williams <mailto:awilliam(at)whitemice(dot)org> GPG D95ED383
Systems Administrator, Python Developer, LPI / NCLA

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tomas Vondra 2015-04-19 21:12:28 Re: extract(year from date) doesn't use index but maybe could?
Previous Message Jon Dufresne 2015-04-19 20:10:42 Re: extract(year from date) doesn't use index but maybe could?