From: | Bill Moseley <moseley(at)hank(dot)org> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Question about a query plan |
Date: | 2005-09-16 14:18:25 |
Message-ID: | 20050916141825.GA17486@hank.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
PostgreSQL 7.4.8 on i486-pc-linux-gnu, Debian
Reading: http://www.postgresql.org/docs/8.0/static/performance-tips.html
I'm just starting to look at query plans, and I'm not understanding a
few things. I don't have that many questions, but I'm including my
examples below, so it's a bit long.
First table is "class" (as in a class taught at a school) and has an
indexed column "class_time" as timestamp(0) with time zone.
First question is why the planner is not using an index scan when I
use "now()" or CURRENT_TIMESTAMP?
EXPLAIN ANALYZE select id from class where class_time > now();
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on "class" (cost=0.00..655.62 rows=414 width=4) (actual time=213.327..288.407 rows=28 loops=1)
Filter: (class_time > now())
EXPLAIN ANALYZE select id from class where class_time > now()::timestamp(0) with time zone;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Seq Scan on "class" (cost=0.00..658.72 rows=414 width=4) (actual time=2.065..5.251 rows=28 loops=1)
Filter: (class_time > (now())::timestamp(0) with time zone)
At first I thought the planner was related to the ration of rows the
planner was expecting to return to the total number of rows. But
using < or > uses a a scan. But if I do "class_time = now()" then it
uses an Index Scan.
But, if I specify the timestamp then it always uses an Index Scan:
select now()::timestamp(0) with time zone;
now
------------------------
2005-09-16 06:44:10-07
EXPLAIN ANALYZE select id from class where class_time > '2005-09-16 06:44:10-07';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Index Scan using class_class_time_index on "class" (cost=0.00..191.17 rows=50 width=4) (actual time=66.072..66.248 rows=28 loops=1)
Index Cond: (class_time > '2005-09-16 06:44:10-07'::timestamp with time zone)
Ok now on to the second question. I have two other related tables.
First, I have a table "person" which you can guess what it holds. And
a link table instructors (a class can have more than one instructor):
\d instructors
Table "public.instructors"
Column | Type | Modifiers
--------+---------+-----------
person | integer | not null
class | integer | not null
Indexes:
"instructors_pkey" primary key, btree (person, "class")
"instructors_class_index" btree ("class")
"instructors_person_index" btree (person)
Foreign-key constraints:
"$2" FOREIGN KEY ("class") REFERENCES "class"(id)
"$1" FOREIGN KEY (person) REFERENCES person(id)
I want to find out who is teaching classes in the future:
EXPLAIN ANALYZE select person, class from instructors
where instructors.class in
(select class.id from class where class_time > now());
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Hash IN Join (cost=656.65..687.64 rows=437 width=8) (actual time=31.741..33.443 rows=29 loops=1)
Hash Cond: ("outer"."class" = "inner".id)
-> Seq Scan on instructors (cost=0.00..20.08 rows=1308 width=8) (actual time=0.057..1.433 rows=1308 loops=1)
-> Hash (cost=655.62..655.62 rows=414 width=4) (actual time=30.963..30.963 rows=0 loops=1)
-> Seq Scan on "class" (cost=0.00..655.62 rows=414 width=4) (actual time=18.716..30.892 rows=28 loops=1)
Filter: (class_time > now())
Perhaps I'm reading that incorrectly, but the sub-select is returning
28 rows of "class.id". Then why is it doing a Seq Scan on instructors
instead of an index scan? If I innumerate all 28 classes I get an
Index Scan.
Finally, not really a question, but my goal is to show a count of classes taught by each in
instructor. Perhaps there's a better query?
EXPLAIN select person, first_name, count(class)
from instructors, person
where instructors.class in
(select id from class where class_time > now() )
AND person.id = instructors.person
group by person, first_name;
QUERY PLAN
----------------------------------------------------------------------------------------------
HashAggregate (cost=734.06..735.15 rows=437 width=17)
-> Merge Join (cost=706.81..730.78 rows=437 width=17)
Merge Cond: ("outer".id = "inner".person)
-> Index Scan using person_pkey on person (cost=0.00..1703.82 rows=12246 width=13)
-> Sort (cost=706.81..707.90 rows=437 width=8)
Sort Key: instructors.person
-> Hash IN Join (cost=656.65..687.64 rows=437 width=8)
Hash Cond: ("outer"."class" = "inner".id)
-> Seq Scan on instructors (cost=0.00..20.08 rows=1308 width=8)
-> Hash (cost=655.62..655.62 rows=414 width=4)
-> Seq Scan on "class" (cost=0.00..655.62 rows=414 width=4)
Filter: (class_time > now())
--
Bill Moseley
moseley(at)hank(dot)org
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2005-09-16 14:23:11 | Re: new.id has wrong value in INSERT RULE |
Previous Message | A. Kretschmer | 2005-09-16 14:16:15 | Re: help needed for functions |