Re: intersect performance (PG 7.1.3 vs 7.2)

From: d_nardini(at)btconnect(dot)com
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: intersect performance (PG 7.1.3 vs 7.2)
Date: 2002-04-06 09:49:16
Message-ID: 8F149658-4943-11D6-876E-0030654E696C@btconnect.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Tom,

Thanks for the info on the 'broken' intersect (the RH 7.1 box does give
the same results (eventually)) - apologies for the size of email in
advance ...

Following are EXPLAIN dumps from each platform for both intersect and
union (both return the correct data on each platform).

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

OS X (PG 7.2) - intersect start-snip :

spdb=# explain select i.master_reference from image i, image_keyword ik,
keyword k
spdb-# where k.keyword = 'pen'
spdb-# and ik.keyword_id = k.keyword_id
spdb-# and i.image_id = ik.image_id
spdb-# intersect
spdb-# select i.master_reference from image i, image_keyword ik,
keyword k
spdb-# where k.keyword = 'cheque'
spdb-# and ik.keyword_id = k.keyword_id
spdb-# and i.image_id = ik.image_id
spdb-# intersect
spdb-# select i.master_reference from image i, image_keyword ik,
keyword k
spdb-# where k.keyword = 'purchase'
spdb-# and ik.keyword_id = k.keyword_id
spdb-# and i.image_id = ik.image_id
spdb-# order by master_reference
spdb-# limit 12, 0;
NOTICE: QUERY PLAN:
Limit (cost=80.41..80.41 rows=1 width=46)
-> Sort (cost=80.41..80.41 rows=1 width=46)
-> SetOp Intersect (cost=80.39..80.40 rows=1 width=46)
-> Sort (cost=80.39..80.39 rows=2 width=46)
-> Append (cost=53.59..80.38 rows=2 width=46)
-> Result (cost=53.59..53.59 rows=1 width=46)
-> SetOp Intersect (cost=53.59..53.59
rows=1 width=46)
-> Sort (cost=53.59..53.59
rows=2 width=46)
-> Append
(cost=0.00..53.58 rows=2 width=46)
-> Subquery Scan
*SELECT* 1 (cost=0.00..26.79 rows=1 width=46)
-> Nested Loop
(cost=0.00..26.79 rows=1 width=46)
-> Nested
Loop (cost=0.00..21.95 rows=1 width=24)
->
Index Scan using keyword_keyword_ix on keyword k (cost=0.00..4.82
rows=1 width=8)
->
Index Scan using image_keyword_keyword_id_ix on image_keyword ik
(cost=0.00..17.07 rows=5 width=16)
-> Index
Scan using image_pkey on image i (cost=0.00..4.82 rows=1 width=22)
-> Subquery Scan
*SELECT* 2 (cost=0.00..26.79 rows=1 width=46)
-> Nested Loop
(cost=0.00..26.79 rows=1 width=46)
-> Nested
Loop (cost=0.00..21.95 rows=1 width=24)
->
Index Scan using keyword_keyword_ix on keyword k (cost=0.00..4.82
rows=1 width=8)
->
Index Scan using image_keyword_keyword_id_ix on image_keyword ik
(cost=0.00..17.07 rows=5 width=16)
-> Index
Scan using image_pkey on image i (cost=0.00..4.82 rows=1 width=22)
-> Subquery Scan *SELECT* 3
(cost=0.00..26.79 rows=1 width=46)
-> Nested Loop (cost=0.00..26.79
rows=1 width=46)
-> Nested Loop (cost=0.00..21.95
rows=1 width=24)
-> Index Scan using
keyword_keyword_ix on keyword k (cost=0.00..4.82 rows=1 width=8)
-> Index Scan using
image_keyword_keyword_id_ix on image_keyword ik (cost=0.00..17.07
rows=5 width=16)
-> Index Scan using image_pkey on
image i (cost=0.00..4.82 rows=1 width=22)
EXPLAIN

master_reference
------------------
0635-00003
1060-00018
1060-00019
(3 rows)

OS X (PG 7.2) - intersect end-snip :

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

RH 7.1 (PG 7.1.3) - intersect start-snip :

spdb=# explain select i.master_reference from image i, image_keyword ik,
keyword k
spdb-# where k.keyword = 'pen'
spdb-# and ik.keyword_id = k.keyword_id
spdb-# and i.image_id = ik.image_id
spdb-# intersect
spdb-# select i.master_reference from image i, image_keyword ik,
keyword k
spdb-# where k.keyword = 'cheque'
spdb-# and ik.keyword_id = k.keyword_id
spdb-# and i.image_id = ik.image_id
spdb-# intersect
spdb-# select i.master_reference from image i, image_keyword ik,
keyword k
spdb-# where k.keyword = 'purchase'
spdb-# and ik.keyword_id = k.keyword_id
spdb-# and i.image_id = ik.image_id
spdb-# order by master_reference
spdb-# limit 12, 0;
NOTICE: QUERY PLAN:
Limit (cost=651.20..651.20 rows=12 width=44)
-> Sort (cost=651.20..651.20 rows=120 width=44)
-> SetOp Intersect (cost=644.06..647.06 rows=120 width=44)
-> Sort (cost=644.06..644.06 rows=1200 width=44)
-> Append (cost=421.67..582.68 rows=1200 width=44)
-> Result (cost=421.67..426.67 rows=200
width=44)
-> SetOp Intersect
(cost=421.67..426.67 rows=200 width=44)
-> Sort (cost=421.67..421.67
rows=2000 width=44)
-> Append
(cost=83.25..312.02 rows=2000 width=44)
-> Subquery Scan
*SELECT* 1 (cost=83.25..156.01 rows=1000 width=44)
-> Merge Join
(cost=83.25..156.01 rows=1000 width=44)
-> Index
Scan using image_pkey on image i (cost=0.00..59.00 rows=1000 width=20)
-> Sort
(cost=83.25..83.25 rows=100 width=24)
->
Merge Join (cost=8.30..79.93 rows=100 width=24)

-> Index Scan using image_keyword_keyword_id_ix on image_keyword ik
(cost=0.00..59.00 rows=1000 width=16)

-> Sort (cost=8.30..8.30 rows=10 width=8)

-> Index Scan using keyword_keyword_ix on keyword k
(cost=0.00..8.14 rows=10 width=8)
-> Subquery Scan
*SELECT* 2 (cost=83.25..156.01 rows=1000 width=44)
-> Merge Join
(cost=83.25..156.01 rows=1000 width=44)
-> Index
Scan using image_pkey on image i (cost=0.00..59.00 rows=1000 width=20)
-> Sort
(cost=83.25..83.25 rows=100 width=24)
->
Merge Join (cost=8.30..79.93 rows=100 width=24)

-> Index Scan using image_keyword_keyword_id_ix on image_keyword ik
(cost=0.00..59.00 rows=1000 width=16)

-> Sort (cost=8.30..8.30 rows=10 width=8)

-> Index Scan using keyword_keyword_ix on keyword k
(cost=0.00..8.14 rows=10 width=8)
-> Subquery Scan *SELECT* 3
(cost=83.25..156.01 rows=1000 width=44)
-> Merge Join (cost=83.25..156.01
rows=1000 width=44)
-> Index Scan using image_pkey on
image i (cost=0.00..59.00 rows=1000 width=20)
-> Sort (cost=83.25..83.25
rows=100 width=24)
-> Merge Join
(cost=8.30..79.93 rows=100 width=24)
-> Index Scan using
image_keyword_keyword_id_ix on image_keyword ik (cost=0.00..59.00
rows=1000 width=16)
-> Sort
(cost=8.30..8.30 rows=10 width=8)
-> Index Scan
using keyword_keyword_ix on keyword k (cost=0.00..8.14 rows=10 width=8)
EXPLAIN

master_reference
------------------
0635-00003
1060-00018
1060-00019
(3 rows)

RH 7.1 (PG 7.1.3) - intersect end-snip :

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

OS X (PG 7.2) - union start-snip :
spdb=# explain select i.master_reference from image i, image_keyword ik,
keyword k
spdb-# where k.keyword = 'pen'
spdb-# and ik.keyword_id = k.keyword_id
spdb-# and i.image_id = ik.image_id
spdb-# union
spdb-# select i.master_reference from image i, image_keyword ik,
keyword k
spdb-# where k.keyword = 'cheque'
spdb-# and ik.keyword_id = k.keyword_id
spdb-# and i.image_id = ik.image_id
spdb-# union
spdb-# select i.master_reference from image i, image_keyword ik,
keyword k
spdb-# where k.keyword = 'purchase'
spdb-# and ik.keyword_id = k.keyword_id
spdb-# and i.image_id = ik.image_id
spdb-# order by master_reference
spdb-# limit 12, 0;
NOTICE: QUERY PLAN:
Limit (cost=80.41..80.41 rows=1 width=46)
-> Sort (cost=80.41..80.41 rows=1 width=46)
-> Unique (cost=80.39..80.40 rows=1 width=46)
-> Sort (cost=80.39..80.39 rows=3 width=46)
-> Append (cost=0.00..80.37 rows=3 width=46)
-> Subquery Scan *SELECT* 1
(cost=0.00..26.79 rows=1 width=46)
-> Nested Loop (cost=0.00..26.79
rows=1 width=46)
-> Nested Loop (cost=0.00..21.95
rows=1 width=24)
-> Index Scan using
keyword_keyword_ix on keyword k (cost=0.00..4.82 rows=1 width=8)
-> Index Scan using
image_keyword_keyword_id_ix on image_keyword ik (cost=0.00..17.07
rows=5 width=16)
-> Index Scan using image_pkey on
image i (cost=0.00..4.82 rows=1 width=22)
-> Subquery Scan *SELECT* 2
(cost=0.00..26.79 rows=1 width=46)
-> Nested Loop (cost=0.00..26.79
rows=1 width=46)
-> Nested Loop (cost=0.00..21.95
rows=1 width=24)
-> Index Scan using
keyword_keyword_ix on keyword k (cost=0.00..4.82 rows=1 width=8)
-> Index Scan using
image_keyword_keyword_id_ix on image_keyword ik (cost=0.00..17.07
rows=5 width=16)
-> Index Scan using image_pkey on
image i (cost=0.00..4.82 rows=1 width=22)
-> Subquery Scan *SELECT* 3
(cost=0.00..26.79 rows=1 width=46)
-> Nested Loop (cost=0.00..26.79
rows=1 width=46)
-> Nested Loop (cost=0.00..21.95
rows=1 width=24)
-> Index Scan using
keyword_keyword_ix on keyword k (cost=0.00..4.82 rows=1 width=8)
-> Index Scan using
image_keyword_keyword_id_ix on image_keyword ik (cost=0.00..17.07
rows=5 width=16)
-> Index Scan using image_pkey on
image i (cost=0.00..4.82 rows=1 width=22)
EXPLAIN

master_reference
------------------
0079-00047
0219-00166
0237-00131
0237-00140
0237-00146
0244-00037
0244-00038
0244-00046
0244-00055
0244-00067
0253-00004
0368-00094
(12 rows)

OS X (PG 7.2) - union end-snip

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

RH 7.1 (PG 7.1.3) - union start-snip :

spdb=# explain select i.master_reference from image i, image_keyword ik,
keyword k
spdb-# where k.keyword = 'pen'
spdb-# and ik.keyword_id = k.keyword_id
spdb-# and i.image_id = ik.image_id
spdb-# union
spdb-# select i.master_reference from image i, image_keyword ik,
keyword k
spdb-# where k.keyword = 'cheque'
spdb-# and ik.keyword_id = k.keyword_id
spdb-# and i.image_id = ik.image_id
spdb-# union
spdb-# select i.master_reference from image i, image_keyword ik,
keyword k
spdb-# where k.keyword = 'purchase'
spdb-# and ik.keyword_id = k.keyword_id
spdb-# and i.image_id = ik.image_id
spdb-# order by master_reference
spdb-# limit 12, 0;
NOTICE: QUERY PLAN:
Limit (cost=661.13..661.13 rows=12 width=44)
-> Sort (cost=661.13..661.13 rows=300 width=44)
-> Unique (cost=641.29..648.79 rows=300 width=44)
-> Sort (cost=641.29..641.29 rows=3000 width=44)
-> Append (cost=83.25..468.02 rows=3000 width=44)
-> Subquery Scan *SELECT* 1
(cost=83.25..156.01 rows=1000 width=44)
-> Merge Join (cost=83.25..156.01
rows=1000 width=44)
-> Index Scan using image_pkey on
image i (cost=0.00..59.00 rows=1000 width=20)
-> Sort (cost=83.25..83.25
rows=100 width=24)
-> Merge Join
(cost=8.30..79.93 rows=100 width=24)
-> Index Scan using
image_keyword_keyword_id_ix on image_keyword ik (cost=0.00..59.00
rows=1000 width=16)
-> Sort
(cost=8.30..8.30 rows=10 width=8)
-> Index Scan
using keyword_keyword_ix on keyword k (cost=0.00..8.14 rows=10 width=8)
-> Subquery Scan *SELECT* 2
(cost=83.25..156.01 rows=1000 width=44)
-> Merge Join (cost=83.25..156.01
rows=1000 width=44)
-> Index Scan using image_pkey on
image i (cost=0.00..59.00 rows=1000 width=20)
-> Sort (cost=83.25..83.25
rows=100 width=24)
-> Merge Join
(cost=8.30..79.93 rows=100 width=24)
-> Index Scan using
image_keyword_keyword_id_ix on image_keyword ik (cost=0.00..59.00
rows=1000 width=16)
-> Sort
(cost=8.30..8.30 rows=10 width=8)
-> Index Scan
using keyword_keyword_ix on keyword k (cost=0.00..8.14 rows=10 width=8)
-> Subquery Scan *SELECT* 3
(cost=83.25..156.01 rows=1000 width=44)
-> Merge Join (cost=83.25..156.01
rows=1000 width=44)
-> Index Scan using image_pkey on
image i (cost=0.00..59.00 rows=1000 width=20)
-> Sort (cost=83.25..83.25
rows=100 width=24)
-> Merge Join
(cost=8.30..79.93 rows=100 width=24)
-> Index Scan using
image_keyword_keyword_id_ix on image_keyword ik (cost=0.00..59.00
rows=1000 width=16)
-> Sort
(cost=8.30..8.30 rows=10 width=8)
-> Index Scan
using keyword_keyword_ix on keyword k (cost=0.00..8.14 rows=10 width=8)

EXPLAIN

master_reference
------------------
0079-00047
0219-00166
0237-00131
0237-00140
0237-00146
0244-00037
0244-00038
0244-00046
0244-00055
0244-00067
0253-00004
0368-00094
(12 rows)

RH 7.1 (PG 7.1.3) - union end-snip

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

As a side issue, I have tried to upgrade the RH 7.1 box with PG 7.2, but
I get the following which I cannot get passed (apologies as this is
clearly a Linux issue ... but it is very frustrating and have hit a
brick wall; I have installed 'openssl-0.9.6-9.i386.rpm' and
'krb5-libs-1.2.2-12.i386.rpm' but no change in behaviour) :

rpm -i --test postgresql-7.2-1PGDG.i686.rpm
error: failed dependencies:
libcrypto.so.0 is needed by postgresql-7.2-1PGDG
libk5crypto.so.2 is needed by postgresql-7.2-1PGDG
libkrb5.so.2 is needed by postgresql-7.2-1PGDG
libreadline.so.4 is needed by postgresql-7.2-1PGDG
libssl.so.0 is needed by postgresql-7.2-1PGDG

Any advice appreciated ;-)

Regards.

=====

On Friday, April 5, 2002, at 11:50 pm, Tom Lane wrote:

> d_nardini(at)btconnect(dot)com writes:
>> I'm running the following statement on a (development) Mac OS X (PG
>> 7.2)
>> box and on a (production) RedHat 7.0 (PG 7.1.3) box. On the OS X it's
>> performing very well ... on RH it's VERY SLOW, with CPU utilization
>> hitting 90%+ (on OS X it hardly exceeds 2%).
>
> Hmm. Nested intersects are actually broken in 7.1.*, cf.
> http://archives.postgresql.org/pgsql-bugs/2001-08/msg00064.php
> Do you get the same results from both boxes?
>
>> BTW - replacing 'intersect'
>> with 'union' in the SQL behaves the same.
>
> In that case it might possibly be a different problem. Do you get the
> same EXPLAIN plan on both boxes?
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2002-04-06 15:45:35 Re: intersect performance (PG 7.1.3 vs 7.2)
Previous Message Barry Lind 2002-04-06 05:35:49 Re: 16 parameter limit