Re: Failure to use indexes

From: Edmund Dengler <edmundd(at)eSentire(dot)com>
To: Postgresql-General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Failure to use indexes
Date: 2005-07-29 18:23:29
Message-ID: Pine.BSO.4.58.0507291421270.23655@cyclops4.esentire.com
Lists: pgsql-general

Greetings!

I managed to get an even simpler example:

select *
from eventlog.record_main
where luid in (
    select luid
    from eventlog_partition._20050706__raw_record
    order by luid
    limit 1
);

If I use <=> rather than <in>, PostgreSQL uses index scanning. As soon as
I use <in> (i.e., a set to compare against), I get sequential scanning,
even though the set size is only a single element.
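
To make the contrast concrete, the two forms differ only in the
comparison operator (a sketch against the tables above; same subquery in
both cases):

    -- Scalar comparison: the planner index-scans each _record_main partition.
    select *
    from eventlog.record_main
    where luid = (
        select luid
        from eventlog_partition._20050706__raw_record
        order by luid
        limit 1
    );

    -- Single-element IN: same logical result, but every partition is
    -- scanned sequentially.
    select *
    from eventlog.record_main
    where luid in (
        select luid
        from eventlog_partition._20050706__raw_record
        order by luid
        limit 1
    );

One diagnostic worth a try is running <set enable_seqscan = off;> in the
session before the EXPLAIN, to see whether the planner can produce the
index plan at all or is merely costing it higher than the sequential scan.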

Regards!
Ed

On Fri, 29 Jul 2005, Edmund Dengler wrote:

> Greetings!
>
> I am using <inherits> to partition several tables. When I perform a query
> on another table, and then try to join against an inherited table set, the
> optimizer does not use any indexes to perform the join.
>
> This is 8.0.1. Is this fixed in 8.0.2 or 8.0.3?
>
> The query:
>
> explain
> select *
> from (
>     select * from eventlog.record_classification as record_classification
>     where
>         time_written >= '2005-07-06 00:00:00+00'::timestamptz
>         and time_written < '2005-07-06 00:00:00+00'::timestamptz
>     order by time_written, luid
>     offset 0
>     limit 500
> ) as classification
> join eventlog.record_main as main using (luid, time_written)
> ;
>
> The explanation:
>
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=51.15..20191003.89 rows=208027 width=178)
>    Hash Cond: (("outer".luid = "inner".luid) AND ("outer".time_written = "inner".time_written))
>    ->  Append  (cost=0.00..14641488.64 rows=554738383 width=96)
>          ->  Seq Scan on record_main main  (cost=0.00..0.00 rows=1 width=96)
>          ->  Seq Scan on _20050723__record_main main  (cost=0.00..94078.62 rows=3564462 width=96)
>          ->  Seq Scan on _20050724__record_main main  (cost=0.00..110075.12 rows=4170512 width=96)
>          ->  Seq Scan on _20050725__record_main main  (cost=0.00..122836.02 rows=4654002 width=96)
>          ->  Seq Scan on _20050726__record_main main  (cost=0.00..142347.71 rows=5393271 width=96)
>          ->  Seq Scan on _20050727__record_main main  (cost=0.00..130858.80 rows=4957980 width=96)
>          ....
>          (and so on, currently 123 such inherited tables)
>          ....
>    ->  Hash  (cost=51.07..51.07 rows=15 width=98)
>          ->  Subquery Scan classification  (cost=50.89..51.07 rows=15 width=98)
>                ->  Limit  (cost=50.89..50.92 rows=15 width=98)
>                      ->  Sort  (cost=50.89..50.92 rows=15 width=98)
>                            Sort Key: record_classification.time_written, record_classification.luid
>                            ->  Result  (cost=0.00..50.59 rows=15 width=98)
>                                  ->  Append  (cost=0.00..50.59 rows=15 width=98)
>                                        ->  Seq Scan on record_classification  (cost=0.00..0.00 rows=1 width=98)
>                                              Filter: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
>                                        ->  Index Scan using _20050705__record_classification_time_written_idx on _20050705__record_classification record_classification  (cost=0.00..3.46 rows=1 width=54)
>                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
>                                        ->  Index Scan using _20050701__record_classification_time_written_idx on _20050701__record_classification record_classification  (cost=0.00..3.59 rows=1 width=54)
>                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
>                                        ->  Index Scan using _20050702__record_classification_time_written_idx on _20050702__record_classification record_classification  (cost=0.00..3.69 rows=1 width=54)
>                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
>                                        ->  Index Scan using _20050703__record_classification_time_written_idx on _20050703__record_classification record_classification  (cost=0.00..3.70 rows=1 width=54)
>                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
>                                        ....
>                                        (and so on)
>                                        ....
>                                        ->  Index Scan using _20050714__record_classification_time_written_idx on _20050714__record_classification record_classification  (cost=0.00..3.69 rows=1 width=53)
>                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
> (164 rows)
>
> Sample tables:
>
> eventlog=# \d eventlog_partition._20050723__record_main
>          Table "eventlog_partition._20050723__record_main"
>         Column        |           Type           | Modifiers
> ----------------------+--------------------------+-----------
>  luid                 | bigint                   | not null
>  host_luid            | integer                  | not null
>  log_luid             | integer                  | not null
>  time_logged          | timestamp with time zone | not null
>  record_number        | bigint                   | not null
>  time_generated_epoch | bigint                   | not null
>  time_generated       | timestamp with time zone | not null
>  time_written_epoch   | bigint                   | not null
>  time_written         | timestamp with time zone | not null
>  event_id             | bigint                   | not null
>  event_type           | integer                  | not null
>  event_category       | integer                  | not null
>  source_luid          | integer                  | not null
>  computer_luid        | integer                  | not null
>  sid_luid             | integer                  |
>  message_luid         | integer                  |
> Indexes:
>     "_20050723__record_main_message_idx" UNIQUE, btree (message_luid, luid)
>     "_20050723__record_main_sid_idx" UNIQUE, btree (sid_luid, luid)
>     "_20050723__record_main_time_generated_idx" UNIQUE, btree (time_generated, luid)
>     "_20050723__record_main_time_logged_idx" UNIQUE, btree (time_logged, luid)
>     "_20050723__record_main_time_written_idx" UNIQUE, btree (time_written, luid)
>     "_20050723__record_main_pkey" btree (luid)
> Inherits: record_main
>
>
> eventlog=# \d eventlog.record_classification
>        Table "eventlog.record_classification"
>      Column     |           Type           | Modifiers
> ----------------+--------------------------+-----------
>  luid           | bigint                   | not null
>  class_luid     | integer                  | not null
>  time_written   | timestamp with time zone | not null
>  account_luid   | integer                  |
>  group_luid     | integer                  |
>  caller_luid    | integer                  |
>  source_machine | character varying(30)    |
>  source_ip      | character varying(30)    |
> Indexes:
>     "record_classification_pkey" PRIMARY KEY, btree (luid)
>     "record_classification_account_idx" btree (account_luid, time_written)
>     "record_classification_caller_idx" btree (caller_luid, time_written)
>     "record_classification_class_idx" btree (class_luid, time_written)
>     "record_classification_group_idx" btree (group_luid, time_written)
>     "record_classification_ip_idx" btree (source_ip, time_written)
>     "record_classification_machine_idx" btree (source_machine, time_written)
> Foreign-key constraints:
>     "record_classification_class_luid_fkey" FOREIGN KEY (class_luid) REFERENCES eventlog.classification(luid)
>     "record_classification_account_luid_fkey" FOREIGN KEY (account_luid) REFERENCES eventlog.account(luid)
>     "record_classification_group_luid_fkey" FOREIGN KEY (group_luid) REFERENCES eventlog.account(luid)
>     "record_classification_caller_luid_fkey" FOREIGN KEY (caller_luid) REFERENCES eventlog.account(luid)
>
>
> <luid> represents a unique key. I have tried the query using just <luid>
> as the join condition, with the same result.
>
> The system can use the <_XXXXXXXX__record_main_pkey> index on each
> <_XXXXXXXX__record_main> table to do an index scan on <luid>, but instead
> it chooses to do a sequential scan. This is true whether
> <_XXXXXXXX__record_main_pkey> is specified as unique or not.
>
> I have run "vacuum analyze" on the entire database before running the
> queries. I have set <default_statistics_target> in postgresql.conf to 100
> to obtain more accurate statistics.
>
> If I specify a specific sub-table (i.e.,
> <eventlog_partition._XXXXXXXX__record_main>) in the join, the optimizer
> uses the indexes to speed up the search.
>
> Any ideas?
>
> Regards!
> Ed
>
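
For completeness, the partition-specific form of the join mentioned above,
which does use the indexes, looks roughly like this (the partition name
and the date range are illustrative, following the naming pattern above):

    select *
    from (
        select *
        from eventlog.record_classification
        where time_written >= '2005-07-06 00:00:00+00'::timestamptz
          and time_written <  '2005-07-07 00:00:00+00'::timestamptz
        order by time_written, luid
        offset 0
        limit 500
    ) as classification
    join eventlog_partition._20050706__record_main as main
        using (luid, time_written);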
