Postgres Optimizer is not smart enough?

From: Litao Wu <litaowu(at)yahoo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Postgres Optimizer is not smart enough?
Date: 2005-01-12 22:25:06
Message-ID: 20050112222506.32084.qmail@web13124.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi All,

Here is my test comparison between Postgres (7.3.2)
optimizer vs Oracle (10g) optimizer.

It seems to me that Postgres optimizer is not smart
enough.

Did I miss anything?

Thanks,

In Postgres:
============
drop table test;
create table test (
module character varying(50),
action_deny integer,
created timestamp with time zone,
customer_id integer,
domain character varying(255));
create or replace function insert_rows () returns
integer as '
BEGIN
for i in 1 .. 500000 loop
insert into test values (i, 2, now(), 100, i);
end loop;
return 1;
END;
' LANGUAGE 'plpgsql';

select insert_rows();

create index test_id1 on test (customer_id, created,
domain);

analyze test;

explain analyze
SELECT module, sum(action_deny)
FROM test
WHERE created >= ('now'::timestamptz - '1
day'::interval) AND customer_id='100'
AND domain='100'
GROUP BY module;


QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=3.12..3.13 rows=1 width=9) (actual
time=91.05..91.05 rows=1 loops=1)
-> Group (cost=3.12..3.12 rows=1 width=9) (actual
time=91.04..91.04 rows=1 loops=1)
-> Sort (cost=3.12..3.12 rows=1 width=9)
(actual time=91.03..91.03 rows=1 loops=1)
Sort Key: module
-> Index Scan using test_id1 on test
(cost=0.00..3.11 rows=1 width=9) (actual
time=0.03..91.00 rows=1 loops=1)
Index Cond: ((customer_id = 100)
AND (created >= '2005-01-11
14:48:44.832552-07'::timestamp with time zone) AND
("domain" = '100'::character varying))
Total runtime: 91.13 msec
(7 rows)

create index test_id2 on test(domain);
analyze test;

explain analyze
SELECT module, sum(action_deny)
FROM test
WHERE created >= ('now'::timestamptz - '1
day'::interval) AND customer_id='100'
AND domain='100'
GROUP BY module;


QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=3.12..3.13 rows=1 width=9) (actual
time=90.30..90.30 rows=1 loops=1)
-> Group (cost=3.12..3.12 rows=1 width=9) (actual
time=90.29..90.30 rows=1 loops=1)
-> Sort (cost=3.12..3.12 rows=1 width=9)
(actual time=90.29..90.29 rows=1 loops=1)
Sort Key: module
-> Index Scan using test_id1 on test
(cost=0.00..3.11 rows=1 width=9) (actual
time=0.03..90.25 rows=1 loops=1)
Index Cond: ((customer_id = 100)
AND (created >= '2005-01-11
14:51:09.555974-07'::timestamp with time zone) AND
("domain" = '100'::character varying))
Total runtime: 90.38 msec
(7 rows)

WHY PG STILL CHOOSE INDEX test_id1???
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
BECAUSE QUERY WILL RUN MUCH FASTER USING test_id2!!!
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

drop index test_id1;
explain analyze
SELECT module, sum(action_deny)
FROM test
WHERE created >= ('now'::timestamptz - '1
day'::interval) AND customer_id='100'
AND domain='100'
GROUP BY module;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=3.12..3.13 rows=1 width=9) (actual
time=0.08..0.08 rows=1 loops=1)
-> Group (cost=3.12..3.13 rows=1 width=9) (actual
time=0.08..0.08 rows=1 loops=1)
-> Sort (cost=3.12..3.13 rows=1 width=9)
(actual time=0.07..0.07 rows=1 loops=1)
Sort Key: module
-> Index Scan using test_id2 on test
(cost=0.00..3.11 rows=1 width=9) (actual
time=0.04..0.05 rows=1 loops=1)
Index Cond: ("domain" =
'100'::character varying)
Filter: ((created >= '2005-01-11
14:53:58.806364-07'::timestamp with time zone) AND
(customer_id = 100))
Total runtime: 0.14 msec
(8 rows)

In Oracle:
==========
drop table test;
create table test (
module character varying(50),
action_deny integer,
created timestamp with time zone,
customer_id integer,
domain character varying(255));

begin
for i in 1..500000 loop
insert into test values (i, 2, current_timestamp,
100, i);
end loop;
end;
/

create index test_id1 on test (customer_id, created,
domain);

analyze table test compute statistics;

set autot on
set timing on

SELECT module, sum(action_deny)
FROM test
WHERE created >= (current_timestamp - interval '1'
day) AND customer_id=100
AND domain='100'
GROUP BY module
/

MODULE
SUM(ACTION_DENY)
--------------------------------------------------
----------------
100
2

Elapsed: 00:00:00.67

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=25
Card=1 Bytes=29
)

1 0 SORT (GROUP BY) (Cost=25 Card=1 Bytes=29)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'TEST'
(TABLE) (Cost=24
Card=1 Bytes=29)

3 2 INDEX (RANGE SCAN) OF 'TEST_ID1'
(INDEX) (Cost=23 Card
=4500)

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2292 consistent gets
2291 physical reads
0 redo size
461 bytes sent via SQL*Net to client
508 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed

create index test_id2 on test (domain);

SELECT module, sum(action_deny)
FROM test
WHERE created >= (current_timestamp - interval '1'
day) AND customer_id=100
AND domain='100'
GROUP BY module
/

MODULE
SUM(ACTION_DENY)
--------------------------------------------------
----------------
100
2

Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5
Card=1 Bytes=29)
1 0 SORT (GROUP BY) (Cost=5 Card=1 Bytes=29)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'TEST'
(TABLE) (Cost=4
Card=1 Bytes=29)

3 2 INDEX (RANGE SCAN) OF 'TEST_ID2'
(INDEX) (Cost=3 Card=
1)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
461 bytes sent via SQL*Net to client
508 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed


__________________________________
Do you Yahoo!?
All your favorites on one personal page Try My Yahoo!
http://my.yahoo.com

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mike Mascari 2005-01-12 22:55:39 Re: Postgres Optimizer is not smart enough?
Previous Message Alex Turner 2005-01-12 17:36:45 Re: which dual-CPU hardware/OS is fastest for PostgreSQL?