Question about a query plan

From: Bill Moseley <moseley(at)hank(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Question about a query plan
Date: 2005-09-16 14:18:25
Message-ID: 20050916141825.GA17486@hank.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


PostgreSQL 7.4.8 on i486-pc-linux-gnu, Debian
Reading: http://www.postgresql.org/docs/8.0/static/performance-tips.html

I'm just starting to look at query plans, and I'm not understanding a
few things. I don't have that many questions, but I'm including my
examples below, so it's a bit long.

First table is "class" (as in a class taught at a school) and has an
indexed column "class_time" as timestamp(0) with time zone.

First question is why the planner is not using an index scan when I
use "now()" or CURRENT_TIMESTAMP?

EXPLAIN ANALYZE select id from class where class_time > now();
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on "class" (cost=0.00..655.62 rows=414 width=4) (actual time=213.327..288.407 rows=28 loops=1)
Filter: (class_time > now())

EXPLAIN ANALYZE select id from class where class_time > now()::timestamp(0) with time zone;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Seq Scan on "class" (cost=0.00..658.72 rows=414 width=4) (actual time=2.065..5.251 rows=28 loops=1)
Filter: (class_time > (now())::timestamp(0) with time zone)

At first I thought the planner was related to the ration of rows the
planner was expecting to return to the total number of rows. But
using < or > uses a a scan. But if I do "class_time = now()" then it
uses an Index Scan.

But, if I specify the timestamp then it always uses an Index Scan:

select now()::timestamp(0) with time zone;
now
------------------------
2005-09-16 06:44:10-07

EXPLAIN ANALYZE select id from class where class_time > '2005-09-16 06:44:10-07';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Index Scan using class_class_time_index on "class" (cost=0.00..191.17 rows=50 width=4) (actual time=66.072..66.248 rows=28 loops=1)
Index Cond: (class_time > '2005-09-16 06:44:10-07'::timestamp with time zone)

Ok now on to the second question. I have two other related tables.
First, I have a table "person" which you can guess what it holds. And
a link table instructors (a class can have more than one instructor):

\d instructors
Table "public.instructors"
Column | Type | Modifiers
--------+---------+-----------
person | integer | not null
class | integer | not null
Indexes:
"instructors_pkey" primary key, btree (person, "class")
"instructors_class_index" btree ("class")
"instructors_person_index" btree (person)
Foreign-key constraints:
"$2" FOREIGN KEY ("class") REFERENCES "class"(id)
"$1" FOREIGN KEY (person) REFERENCES person(id)

I want to find out who is teaching classes in the future:

EXPLAIN ANALYZE select person, class from instructors
where instructors.class in
(select class.id from class where class_time > now());
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Hash IN Join (cost=656.65..687.64 rows=437 width=8) (actual time=31.741..33.443 rows=29 loops=1)
Hash Cond: ("outer"."class" = "inner".id)
-> Seq Scan on instructors (cost=0.00..20.08 rows=1308 width=8) (actual time=0.057..1.433 rows=1308 loops=1)
-> Hash (cost=655.62..655.62 rows=414 width=4) (actual time=30.963..30.963 rows=0 loops=1)
-> Seq Scan on "class" (cost=0.00..655.62 rows=414 width=4) (actual time=18.716..30.892 rows=28 loops=1)
Filter: (class_time > now())

Perhaps I'm reading that incorrectly, but the sub-select is returning
28 rows of "class.id". Then why is it doing a Seq Scan on instructors
instead of an index scan? If I innumerate all 28 classes I get an
Index Scan.

Finally, not really a question, but my goal is to show a count of classes taught by each in
instructor. Perhaps there's a better query?

EXPLAIN select person, first_name, count(class)
from instructors, person
where instructors.class in
(select id from class where class_time > now() )
AND person.id = instructors.person
group by person, first_name;
QUERY PLAN
----------------------------------------------------------------------------------------------
HashAggregate (cost=734.06..735.15 rows=437 width=17)
-> Merge Join (cost=706.81..730.78 rows=437 width=17)
Merge Cond: ("outer".id = "inner".person)
-> Index Scan using person_pkey on person (cost=0.00..1703.82 rows=12246 width=13)
-> Sort (cost=706.81..707.90 rows=437 width=8)
Sort Key: instructors.person
-> Hash IN Join (cost=656.65..687.64 rows=437 width=8)
Hash Cond: ("outer"."class" = "inner".id)
-> Seq Scan on instructors (cost=0.00..20.08 rows=1308 width=8)
-> Hash (cost=655.62..655.62 rows=414 width=4)
-> Seq Scan on "class" (cost=0.00..655.62 rows=414 width=4)
Filter: (class_time > now())

--
Bill Moseley
moseley(at)hank(dot)org

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2005-09-16 14:23:11 Re: new.id has wrong value in INSERT RULE
Previous Message A. Kretschmer 2005-09-16 14:16:15 Re: help needed for functions