Battleground was a machine running CentOS 5.5 and the two towers were 2
databases. First, PostgreSQL 8.4.4:
scott=# select count(*) from emp
e1,emp e2, emp e3, emp e4,emp e5, emp e6;
count
---------
7529536
(1 row)
Time: 2127.345 ms
scott=# select version();
version
--------------------------------------------------------------------------------
--------------------------------
PostgreSQL 8.4.4 on i686-redhat-linux-gnu,
compiled by GCC gcc (GCC) 4.1.2 2008
0704 (Red Hat 4.1.2-46), 32-bit
(1 row)
Time: 1.008 ms
scott=# explain analyze
scott-# select count(*) from emp e1,emp e2,
emp e3, emp e4,emp e5, emp e6;
QUERY PLAN
--------------------------------------------------------------------------------
---------------------------------------------------------
Aggregate
(cost=1052615959933232.62..1052615959933232.62 rows=1 width=0) (actu
al time=13684.811..13684.811 rows=1 loops=1)
-> Nested Loop
(cost=16.60..935975959933232.62 rows=46656000000000000 width
=0) (actual time=0.038..10508.108
rows=7529536 loops=1)
-> Nested Loop
(cost=0.00..2855959933216.00 rows=77760000000000 width
=0) (actual time=0.025..820.829 rows=537824
loops=1)
-> Nested Loop
(cost=0.00..4759933216.00 rows=129600000000 widt
h=0) (actual time=0.021..59.701 rows=38416
loops=1)
-> Nested Loop
(cost=0.00..7933216.00 rows=216000000 widt
h=0) (actual time=0.018..4.424 rows=2744
loops=1)
-> Nested
Loop (cost=0.00..13216.00 rows=360000 wid
th=0) (actual time=0.015..0.333 rows=196
loops=1)
-> Seq
Scan on emp e1 (cost=0.00..16.00 rows=
600 width=0) (actual time=0.009..0.018
rows=14 loops=1)
-> Seq
Scan on emp e6 (cost=0.00..16.00 rows=
600 width=0) (actual time=0.002..0.010
rows=14 loops=14)
-> Seq Scan
on emp e5 (cost=0.00..16.00 rows=600 wi
dth=0) (actual time=0.002..0.008 rows=14
loops=196)
-> Seq Scan on emp
e4 (cost=0.00..16.00 rows=600 width=0)
(actual time=0.001..0.007 rows=14
loops=2744)
-> Seq Scan on emp e3
(cost=0.00..16.00 rows=600 width=0) (actu
al time=0.001..0.007 rows=14 loops=38416)
-> Materialize
(cost=16.60..22.60 rows=600 width=0) (actual time=0.00
0..0.006 rows=14 loops=537824)
-> Seq Scan on emp e2
(cost=0.00..16.00 rows=600 width=0) (actu
al time=0.003..0.012 rows=14 loops=1)
Total runtime: 13684.977 ms
(14 rows)
Time: 13714.576 ms
scott=# \d+ emp
Table
"public.emp"
Column | Type |
Modifiers | Storage | Description
----------+-----------------------------+-----------+----------+-------------
empno | smallint |
not null | plain |
ename | character varying(10) |
not null | extended |
job | character varying(9)
| | extended |
mgr | smallint
| | plain |
hiredate | timestamp without time zone
| | plain |
sal | double precision
| | plain |
comm | double precision
| | plain |
deptno | smallint
| | plain |
Indexes:
"emp_pkey" PRIMARY KEY, btree (empno)
"emp_deptno_i" btree (deptno)
"emp_ename_id" btree (ename)
"ind_emp_deptno" btree (deptno)
Foreign-key constraints:
"fk_deptno" FOREIGN KEY (deptno)
REFERENCES dept(deptno)
Has OIDs: no
Now, Oracle 11.2.0.1 with the April patch:
SQL> set autotrace on
SQL> select count(*) from emp e1,emp e2,
emp e3, emp e4,emp e5, emp e6;
COUNT(*)
----------
7529536
Elapsed: 00:00:00.85
Execution Plan
----------------------------------------------------------
Plan hash value: 1049923164
------------------------------------------------------------------------------
| Id | Operation |
Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT
| | 1 | 157K (1)| 00:31:25 |
| 1 | SORT AGGREGATE
| | 1 | | |
| 2 | MERGE JOIN CARTESIAN
| | 7529K| 157K (1)| 00:31:25 |
| 3 | MERGE JOIN CARTESIAN
| | 537K| 11224 (1)| 00:02:15 |
| 4 | MERGE JOIN CARTESIAN
| | 38416 | 808 (1)| 00:00:10 |
| 5 | MERGE JOIN CARTESIAN
| | 2744 | 62 (0)| 00:00:01 |
| 6 | MERGE JOIN CARTESIAN
| | 196 | 7 (0)| 00:00:01 |
| 7 | INDEX FULL SCAN |
PK_EMP | 14 | 1 (0)| 00:00:01 |
| 8 | BUFFER SORT
| | 14 | 6 (0)| 00:00:01 |
| 9 | INDEX FAST FULL SCAN|
PK_EMP | 14 | 0 (0)| 00:00:01 |
| 10 | BUFFER SORT
| | 14 | 62 (0)| 00:00:01 |
| 11 | INDEX FAST FULL SCAN |
PK_EMP | 14 | 0 (0)| 00:00:01 |
| 12 | BUFFER SORT
| | 14 | 808 (1)| 00:00:10 |
| 13 | INDEX FAST FULL SCAN |
PK_EMP | 14 | 0 (0)| 00:00:01 |
| 14 | BUFFER SORT
| | 14 | 11224 (1)| 00:02:15 |
| 15 | INDEX FAST FULL SCAN |
PK_EMP | 14 | 0 (0)| 00:00:01 |
| 16 | BUFFER SORT
| | 14 | 157K (1)| 00:31:25 |
| 17 | INDEX FAST FULL SCAN |
PK_EMP | 14 | 0 (0)| 00:00:01 |
------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
21 consistent gets
0 physical reads
0 redo size
425 bytes sent via SQL*Net to
client
419 bytes received via SQL*Net
from client
2 SQL*Net roundtrips to/from
client
5 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition
Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for Linux: Version 11.2.0.1.0 -
Production
NLSRTL Version 11.2.0.1.0 - Production
Both databases have current statistics, the data is exactly the same,
as shown by the results. Oracle finished in 0.85 second (850
milliseconds) while PostgreSQL took 2127.345 ms, 2.5 times slower than
Oracle. The difference that is obvious is the access path: Postgres
chose nested loops, while Oracle chose merge join and utilized the
primary key for the table. Postgres, apparently cannot do "fast full
scan" of an index and doesn't know how to utilize the primary key in
cases like this. However, my gripe is that the optimizer should have
selected merge join, just like Oracle did. In this case, nested loops
are definitely the wrong choice. Rule based optimizers, the kind of the
optimizer that takes into consideration only the structure of the
table, usually ends up being dominated by the nested loops method.
Nested loops method usually dominates the OLTP type applications but
can really mess up large reports. I am under the impression that
Postgres query planner is geared toward the OLTP type of the database.
Maybe a new parameter is needed that would somehow shift gears to "data
warehouse use", on demand? I have to say, the advantage is still on the
side of Sauron.
--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com