Re: index not used in joins

From: Sebastian Böck <sebastianboeck(at)freenet(dot)de>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: index not used in joins
Date: 2004-12-13 14:18:44
Message-ID: 41BDA4C4.7050201@freenet.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Richard Huxton wrote:
> Sebastian Böck wrote:
>
>> Richard Huxton wrote:
>>
>>> Sebastian Böck wrote:
>>>
>>>> Richard Huxton wrote:
>>>>
>>>>> Can you post the output from your "explain analyse" calls too? The
>>>>> statistics aren't going to be the same on different machines.
>>>>>
>>>>
>>>> Sure, here it is.
>>>
>>>
>>>
>>>
>>> Thanks. (PS - remember to cc the list too).
>>
>>
>>
>> [output of EXPLAIN ANALYZE]
>>
>>> OK - so what you want to know is why index "test_999" is used in the
>>> second but not the first, even though both return the same rows.
>>>
>>> The fact is that the conditional index:
>>> CREATE INDEX test_999 ON test (datum)
>>> WHERE version = '999' OR approved IS NOT NULL;
>>> AFAIK looks at the WHERE clause of your query to determine where it
>>> can run. Don't forget that the planner needs to pick which index is
>>> best *before* it starts fetching data.
>>>
>>> So - in the first example there might be rows where e.g.
>>> t.version=998 which means test_999 would be a poor choice of index.
>>
>>
>>
>> But what if the table users contains only 1 row and the column "version"
>> has a value of "999"?
>
>
> It still doesn't know that the only value in "version" is 999(*). Let's
> say there were 2000 rows and 1900 had the value 999 - the index is still
> useless because we'd have to do a sequential scan to check the remaining
> 200 rows.
>
>> Are there any other options to speed up this kind of query?
>
>
> Well, your problem is the (version=X OR approved IS NOT NULL) clause. I
> must admit I can't quite see what this is supposed to do. The "test"
> table connects to the "users" table via "version" (and "datum", though
> not a simple check) unless the "test" has been "approved", in which case
> it applies to all users?
> Can you explain what the various tables/columns are really for?

The whole thing is a multiuser facility managment application.
Every user can plan things like he wants (different versions).
All these changes apply to a common (approved) version.
Things get complicated as everybody should be able to "travel"
through the history via the "datum" field.

That's why i need this "silly OR" in my where-clause.

At the moment i get very exciting results using immutable
functions, but i have another question.

In the docs it is stated that:

IMMUTABLE indicates that the function always returns the same
result when given the same argument values;

What if i define my functions like:

CREATE OR REPLACE FUNCTION datum () RETURNS TIMESTAMP AS '
SELECT datum FROM public.benutzer;
' LANGUAGE sql IMMUTABLE;

They normally (untill now) give the correct results,
also if the values in the underlaying view changes.

Can i relay on this or is it only luck.

> (*) Don't forget the statistics for column values are usually
> out-of-date compared to the actual data, so you can't rely on it.

I'm aware of that.

Thanks

Sebastian

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 2004-12-13 14:21:43 Re: Temporary tables and disk activity
Previous Message Richard Huxton 2004-12-13 13:13:46 Re: index not used in joins