| From: | Bill Moseley <moseley(at)hank(dot)org> | 
|---|---|
| To: | pgsql-general(at)postgresql(dot)org | 
| Subject: | Question about a query plan | 
| Date: | 2005-09-16 14:18:25 | 
| Message-ID: | 20050916141825.GA17486@hank.org | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-general | 
PostgreSQL 7.4.8 on i486-pc-linux-gnu, Debian
Reading: http://www.postgresql.org/docs/8.0/static/performance-tips.html
I'm just starting to look at query plans, and I'm not understanding a
few things.  I don't have that many questions, but I'm including my
examples below, so it's a bit long.
First table is "class" (as in a class taught at a school) and has an
indexed column "class_time" as timestamp(0) with time zone.
First question is why the planner is not using an index scan when I
use "now()" or CURRENT_TIMESTAMP?
    EXPLAIN ANALYZE select id from class where class_time > now();                   
                                                    QUERY PLAN                                                
    ----------------------------------------------------------------------------------------------------------
     Seq Scan on "class"  (cost=0.00..655.62 rows=414 width=4) (actual time=213.327..288.407 rows=28 loops=1)
       Filter: (class_time > now())
    EXPLAIN ANALYZE select id from class where class_time > now()::timestamp(0) with time zone;    
                                                  QUERY PLAN                                              
    ------------------------------------------------------------------------------------------------------
     Seq Scan on "class"  (cost=0.00..658.72 rows=414 width=4) (actual time=2.065..5.251 rows=28 loops=1)
       Filter: (class_time > (now())::timestamp(0) with time zone)
At first I thought the planner was related to the ration of rows the
planner was expecting to return to the total number of rows.  But
using < or > uses a a scan.  But if I do "class_time = now()" then it
uses an Index Scan.
But, if I specify the timestamp then it always uses an Index Scan:
    select now()::timestamp(0) with time zone;
              now           
    ------------------------
     2005-09-16 06:44:10-07
     EXPLAIN ANALYZE select id from class where class_time > '2005-09-16 06:44:10-07';          
                                                                  QUERY PLAN                                                              
    --------------------------------------------------------------------------------------------------------------------------------------
     Index Scan using class_class_time_index on "class"  (cost=0.00..191.17 rows=50 width=4) (actual time=66.072..66.248 rows=28 loops=1)
       Index Cond: (class_time > '2005-09-16 06:44:10-07'::timestamp with time zone)
Ok now on to the second question.  I have two other related tables.
First, I have a table "person" which you can guess what it holds.  And
a link table instructors (a class can have more than one instructor):
     \d instructors
      Table "public.instructors"
     Column |  Type   | Modifiers 
    --------+---------+-----------
     person | integer | not null
     class  | integer | not null
    Indexes:
        "instructors_pkey" primary key, btree (person, "class")
        "instructors_class_index" btree ("class")
        "instructors_person_index" btree (person)
    Foreign-key constraints:
        "$2" FOREIGN KEY ("class") REFERENCES "class"(id)
        "$1" FOREIGN KEY (person) REFERENCES person(id)
I want to find out who is teaching classes in the future:
EXPLAIN ANALYZE select person, class from instructors 
                where instructors.class in 
                    (select class.id from class where class_time > now());
                                                    QUERY PLAN                                                    
--------------------------------------------------------------------------------------------------------------------
 Hash IN Join  (cost=656.65..687.64 rows=437 width=8) (actual time=31.741..33.443 rows=29 loops=1)
   Hash Cond: ("outer"."class" = "inner".id)
   ->  Seq Scan on instructors  (cost=0.00..20.08 rows=1308 width=8) (actual time=0.057..1.433 rows=1308 loops=1)
   ->  Hash  (cost=655.62..655.62 rows=414 width=4) (actual time=30.963..30.963 rows=0 loops=1)
         ->  Seq Scan on "class"  (cost=0.00..655.62 rows=414 width=4) (actual time=18.716..30.892 rows=28 loops=1)
               Filter: (class_time > now())
Perhaps I'm reading that incorrectly, but the sub-select is returning
28 rows of "class.id".  Then why is it doing a Seq Scan on instructors
instead of an index scan?  If I innumerate all 28 classes I get an
Index Scan.
Finally, not really a question, but my goal is to show a count of classes taught by each in
instructor.  Perhaps there's a better query?
EXPLAIN select person, first_name, count(class) 
        from instructors, person
        where instructors.class in 
            (select id from class where class_time > now() ) 
            AND person.id = instructors.person
            group by person, first_name;
                                          QUERY PLAN                                          
----------------------------------------------------------------------------------------------
 HashAggregate  (cost=734.06..735.15 rows=437 width=17)
   ->  Merge Join  (cost=706.81..730.78 rows=437 width=17)
         Merge Cond: ("outer".id = "inner".person)
         ->  Index Scan using person_pkey on person  (cost=0.00..1703.82 rows=12246 width=13)
         ->  Sort  (cost=706.81..707.90 rows=437 width=8)
               Sort Key: instructors.person
               ->  Hash IN Join  (cost=656.65..687.64 rows=437 width=8)
                     Hash Cond: ("outer"."class" = "inner".id)
                     ->  Seq Scan on instructors  (cost=0.00..20.08 rows=1308 width=8)
                     ->  Hash  (cost=655.62..655.62 rows=414 width=4)
                           ->  Seq Scan on "class"  (cost=0.00..655.62 rows=414 width=4)
                                 Filter: (class_time > now())
-- 
Bill Moseley
moseley(at)hank(dot)org
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2005-09-16 14:23:11 | Re: new.id has wrong value in INSERT RULE | 
| Previous Message | A. Kretschmer | 2005-09-16 14:16:15 | Re: help needed for functions |