Re: PostgreSQL runs a query much slower than BDE and MySQL

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Peter Hardman" <peter(at)ssbg(dot)zetnet(dot)co(dot)uk>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: PostgreSQL runs a query much slower than BDE and MySQL
Date:	2006-08-16 22:51:57
Message-ID:	29246.1155768717@sss.pgh.pa.us
Lists:	pgsql-performance

"Peter Hardman" <peter(at)ssbg(dot)zetnet(dot)co(dot)uk> writes:
> I'm in the process of migrating a Paradox 7/BDE 5.01 database from single-user
> Paradox to a web based interface to either MySQL or PostgreSQL.
> The query I run is:

> /* Select all sheep whose most recent transfer was into the subject flock */
> SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in
> FROM SHEEP_FLOCK f1 JOIN
> /* The last transfer date for each sheep */
> (SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date
> FROM SHEEP_FLOCK f
> GROUP BY f.regn_no) f2
> ON f1.regn_no = f2.regn_no
> WHERE f1.flock_no = '1359'
> AND f1.transfer_date = f2.last_xfer_date

This seems pretty closely related to this recent thread:
http://archives.postgresql.org/pgsql-performance/2006-08/msg00220.php
in which the OP is doing a very similar kind of query in almost exactly
the same way.

I can't help thinking that there's probably a better way to phrase this
type of query in SQL, though it's not jumping out at me what that is.
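
One candidate rephrasing, offered only as a sketch and not tested against this data: ask for the flock's rows that have no later transfer for the same sheep, so the per-sheep MAX() over the whole table is never formed. It assumes transfer_date is never NULL (a NULL date would pass the NOT EXISTS test but fail the original equality test) and that an index leads with regn_no, as the scan on sheep_flock_pkey in the plan suggests. Whether it actually runs faster would have to be checked with EXPLAIN ANALYZE.

```sql
-- Sketch: sheep whose most recent transfer was into flock 1359,
-- written as "no later transfer exists" instead of a join against MAX().
-- Assumes transfer_date is NOT NULL.
SELECT DISTINCT f1.regn_no, f1.transfer_date AS date_in
FROM sheep_flock f1
WHERE f1.flock_no = '1359'
  AND NOT EXISTS (
        SELECT 1
        FROM sheep_flock f
        WHERE f.regn_no = f1.regn_no
          AND f.transfer_date > f1.transfer_date
      );
```

The subquery is evaluated once per row of flock 1359, so the work scales with the size of that flock and not with the size of the whole table.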

What I find interesting though is that it sounds like both MSSQL and
Paradox know something we don't about how to optimize it. PG doesn't
have any idea how to do the above query without forming the full output
of the sub-select, but I suspect that the commercial DBs know a
shortcut; perhaps they are able to automatically derive a restriction
in the subquery similar to what you did by hand. Does Paradox have
anything comparable to EXPLAIN that would give a hint about the query
plan they are using?
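
To make "a restriction in the subquery" concrete, here is a sketch of what such a hand-derived restriction might look like. The exact form used in the original message is not reproduced here, so the IN clause below is an assumption: it limits the aggregate to sheep that have ever been transferred into the subject flock, which cannot change the final result because the outer WHERE clause discards all other sheep anyway.

```sql
-- Sketch: same query, with the aggregate restricted by hand to sheep
-- that appear in flock 1359 at least once (assumed form of the restriction).
SELECT DISTINCT f1.regn_no, f1.transfer_date AS date_in
FROM sheep_flock f1
JOIN (SELECT f.regn_no, MAX(f.transfer_date) AS last_xfer_date
      FROM sheep_flock f
      WHERE f.regn_no IN (SELECT regn_no
                          FROM sheep_flock
                          WHERE flock_no = '1359')
      GROUP BY f.regn_no) f2
  ON f1.regn_no = f2.regn_no
WHERE f1.flock_no = '1359'
  AND f1.transfer_date = f2.last_xfer_date;
```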

Also, just as in the other thread, I'm thinking that a seqscan+hash
aggregate would be a better idea than this bit:

>   ->  GroupAggregate  (cost=0.00..3924.91 rows=33676 width=13) (actual time=0.324..473.131 rows=38815 loops=1)
>         ->  Index Scan using sheep_flock_pkey on sheep_flock f  (cost=0.00..3094.95 rows=81802 width=13) (actual time=0.295..232.156)

Possibly you need to raise work_mem to get it to consider the hash
aggregation method.
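
A minimal way to try that for the current session only is sketched below. The value is an arbitrary illustration, not a recommendation; it is given as a plain integer, which is interpreted as kilobytes.

```sql
-- Raise work_mem for this session only (16384 kB = 16 MB; arbitrary value).
SET work_mem = 16384;

-- Re-check the plan for the aggregate on its own. A "HashAggregate" node
-- over a "Seq Scan" would mean the planner now picks hash aggregation.
EXPLAIN ANALYZE
SELECT f.regn_no, MAX(f.transfer_date) AS last_xfer_date
FROM sheep_flock f
GROUP BY f.regn_no;

-- Return to the configured default.
RESET work_mem;
```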

BTW, are you *sure* you are testing PG 8.1? The "Subquery Scan f2" plan
node looks unnecessary to me, and I'd have expected 8.1 to drop it out.
8.0 and before would have left it in the plan though. This doesn't make
all that much difference performance-wise in itself, but it does make me
wonder what you are testing.
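
The server you are actually connected to can be checked from the same session that runs the query:

```sql
-- Full version string of the running server, including build details.
SELECT version();

-- Just the version number.
SHOW server_version;
```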

regards, tom lane
