Quick Links

Failure to use indexes

From:	Edmund Dengler <edmundd(at)eSentire(dot)com>
To:	Postgresql-General <pgsql-general(at)postgresql(dot)org>
Subject:	Failure to use indexes
Date:	2005-07-29 18:10:44
Message-ID:	Pine.BSO.4.58.0507291347200.23655@cyclops4.esentire.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Greetings!

I am using <inherits> to partition several tables. When I perform a query
on another table, and then try to join against an inherited table set, the
optimizer does not use any indexes to perform the join.

This is 8.0.1. Is this fixed in 8.0.2 or 8.0.3?

The query:

explain
select *
from (
select * from eventlog.record_classification as record_classification
where
time_written >= '2005-07-06 00:00:00+00'::timestamptz
and time_written < '2005-07-06 00:00:00+00'::timestamptz
order by time_written, luid
offset 0
limit 500
) as classification
join eventlog.record_main as main using (luid, time_written)
;

The explanation:

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=51.15..20191003.89 rows=208027 width=178)
Hash Cond: (("outer".luid = "inner".luid) AND ("outer".time_written = "inner".time_written))
-> Append (cost=0.00..14641488.64 rows=554738383 width=96)
-> Seq Scan on record_main main (cost=0.00..0.00 rows=1 width=96)
-> Seq Scan on _20050723__record_main main (cost=0.00..94078.62 rows=3564462 width=96)
-> Seq Scan on _20050724__record_main main (cost=0.00..110075.12 rows=4170512 width=96)
-> Seq Scan on _20050725__record_main main (cost=0.00..122836.02 rows=4654002 width=96)
-> Seq Scan on _20050726__record_main main (cost=0.00..142347.71 rows=5393271 width=96)
-> Seq Scan on _20050727__record_main main (cost=0.00..130858.80 rows=4957980 width=96)
....
(and so on, currently 123 such inheritd tables)
....
-> Hash (cost=51.07..51.07 rows=15 width=98)
-> Subquery Scan classification (cost=50.89..51.07 rows=15 width=98)
-> Limit (cost=50.89..50.92 rows=15 width=98)
-> Sort (cost=50.89..50.92 rows=15 width=98)
Sort Key: record_classification.time_written, record_classification.luid
-> Result (cost=0.00..50.59 rows=15 width=98)
-> Append (cost=0.00..50.59 rows=15 width=98)
-> Seq Scan on record_classification (cost=0.00..0.00 rows=1 width=98)
Filter: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
-> Index Scan using _20050705__record_classification_time_written_idx on _20050705__record_classification record_classification (cost=0.00..3.46 rows=1 width=54)
Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
-> Index Scan using _20050701__record_classification_time_written_idx on _20050701__record_classification record_classification (cost=0.00..3.59 rows=1 width=54)
Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
-> Index Scan using _20050702__record_classification_time_written_idx on _20050702__record_classification record_classification (cost=0.00..3.69 rows=1 width=54)
Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
-> Index Scan using _20050703__record_classification_time_written_idx on _20050703__record_classification record_classification (cost=0.00..3.70 rows=1 width=54)
Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
....
(and so on)
....
-> Index Scan using _20050714__record_classification_time_written_idx on _20050714__record_classification record_classification (cost=0.00..3.69 rows=1 width=53)
Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
(164 rows)

Sample tables:

eventlog=# \d eventlog.record_classification
Table "eventlog.record_classification"
Column | Type | Modifiers
----------------+--------------------------+-----------
luid | bigint | not null
class_luid | integer | not null
time_written | timestamp with time zone | not null
account_luid | integer |
group_luid | integer |
caller_luid | integer |
source_machine | character varying(30) |
source_ip | character varying(30) |
Indexes:
"record_classification_pkey" PRIMARY KEY, btree (luid)
"record_classification_account_idx" btree (account_luid, time_written)
"record_classification_caller_idx" btree (caller_luid, time_written)
"record_classification_class_idx" btree (class_luid, time_written)
"record_classification_group_idx" btree (group_luid, time_written)
"record_classification_ip_idx" btree (source_ip, time_written)
"record_classification_machine_idx" btree (source_machine, time_written)
Foreign-key constraints:
"record_classification_class_luid_fkey" FOREIGN KEY (class_luid) REFERENCES eventlog.classification(luid)
"record_classification_account_luid_fkey" FOREIGN KEY (account_luid) REFERENCES eventlog.account(luid)
"record_classification_group_luid_fkey" FOREIGN KEY (group_luid) REFERENCES eventlog.account(luid)
"record_classification_caller_luid_fkey" FOREIGN KEY (caller_luid) REFERENCES eventlog.account(luid)

<luid> represents a unique key. I have tried the query using just <luid>
as the join condition, same result.

The system can use the <_XXXXXXXX__record_main_pkey> index on each
<_XXXXXXXX__record_main> table to do an index scan on <luid>, but instead
it chooses to do a sequential scan. This is true whether
<_XXXXXXXX__record_main_pkey> is specified as unique or not.

I have "vacuum analyze" the entire database before running the queries. I
have set the <default_statistics_target> in postgresql.conf to 100 to
obtain more accurate statistics.

If I specify a specific sub-table (ie,
<eventlog_partition._XXXXXXXX__record_main>), in the join, the optimizer
uses the indexes to speed search.

Any ideas?

Regards!
Ed

Responses

Re: Failure to use indexes at 2005-07-29 18:23:29 from Edmund Dengler

Browse pgsql-general by date

	From	Date	Subject
Next Message	Brian Wong	2005-07-29 18:15:33	Re: how to select
Previous Message	Bruno Wolff III	2005-07-29 17:13:17	Re: GUID for postgreSQL