Re: PostgreSQL using the wrong Index

From: John A Meinel <john(at)arbash-meinel(dot)com>
To: Alex Stapleton <alexs(at)advfn(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: PostgreSQL using the wrong Index
Date: 2005-06-13 14:47:10
Message-ID: 42AD9C6E.3000705@arbash-meinel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Alex Stapleton wrote:

> Oh, we are running 7.4.2 btw. And our random_page_cost = 1
>
Which is only correct if your entire db fits into memory. Also, try
updating to a later 7.4 version if at all possible.

> On 13 Jun 2005, at 14:02, Alex Stapleton wrote:
>
>> We have two index's like so
>>
>> l1_historical=# \d "N_intra_time_idx"
>> Index "N_intra_time_idx"
>> Column | Type
>> --------+-----------------------------
>> time | timestamp without time zone
>> btree
>>
Just so you are aware, writing this as: "We have an index on
N_intra(time) and one on N_Intra(symbol, time)" is a lot more succinct.

>>
>> l1_historical=# \d "N_intra_pkey"
>> Index "N_intra_pkey"
>> Column | Type
>> --------+-----------------------------
>> symbol | text
>> time | timestamp without time zone
>> unique btree (primary key)
>>
>> and on queries like this
>>
>> select * from "N_intra" where symbol='SOMETHING WHICH DOESNT EXIST'
>> order by time desc limit 1;
>>
>> PostgreSQL takes a very long time to complete, as it effectively
>> scans the entire table, backwards. And the table is huge, about 450
>> million rows. (btw, there are no triggers or any other exciting
>> things like that on our tables in this db.)
>>
>> but on things where the symbol does exist in the table, it's more or
>> less fine, and nice and fast.
>
What happens if you do:
SELECT * FROM "N_intra" WHERE symbol='doesnt exist' ORDER BY symbol,
time DESC LIMIT 1;

Yes, symbol is constant, but it frequently helps the planner realize it
can use an index scan if you include all terms in the index in the ORDER
BY clause.

>>
>> Whilst the option the planner has taken might be faster most of the
>> time, the worst case scenario is unacceptable for obvious reasons.
>> I've googled for trying to force the use of a specific index, but
>> can't find anything relevant. Does anyone have any suggestions on
>> getting it to use an index which hopefully will have better worst
>> case performance?
>
Try the above first. You could also create a new index on symbol
CREATE INDEX "N_intra_symbol_idx" ON "N_intra"(symbol);

Then the WHERE clause should use the symbol index, which means it can
know quickly that an entry doesn't exist. I'm not sure how many entries
you have per symbol, though, so this might cause problems in the ORDER
BY time portion.

I'm guessing what you really want is to just do the ORDER BY symbol, time.

John
=:->

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2005-06-13 14:51:39 Re: How to enhance the chance that data is in disk cache
Previous Message Kevin Grittner 2005-06-13 14:34:33 Re: Index ot being used