Re: improving performance of UNION and ORDER BY

From: Chris Gamache <cgg007(at)yahoo(dot)com>
To: Jean-Luc Lachance <jllachan(at)nsd(dot)ca>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: improving performance of UNION and ORDER BY
Date: 2002-03-04 14:21:11
Message-ID: 20020304142111.21267.qmail@web13807.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

New Query...

(select
a.username as "User",
a.trans_date as "Date",
tl.longtype as "Type",
a.trans_data as "Query Data",
a.trans_charge as "Charged",
a.user_reference_id as "Reference ID"
from a_trans_log a, addtypelong tl
where (
(tl.shorttype=a.trans_type) and
(a.trans_date >= '12/31/01'::TIMESTAMP)
) order by 4 desc, 2 limit 20)
union
(select
b.username as "User",
b.trans_date as "Date",
tl.longtype as "Type",
b.trans_data as "Query Data",
b.trans_charge as "Charged",
b.user_reference_id as "Reference ID"
from b_trans_log b, addtypelong tl
where (
(tl.shorttype=b.trans_type) and
(b.trans_date >= '12/31/01'::TIMESTAMP)
) order by 4 desc, 2 limit 20)
order by 4 desc, 2 limit 20;

Limit (cost=12674.47..12674.47 rows=4 width=84)
-> Sort (cost=12674.47..12674.47 rows=4 width=84)
-> Unique (cost=12673.83..12674.43 rows=4 width=84)
-> Sort (cost=12673.83..12673.83 rows=40 width=84)
-> Append (cost=12055.58..12672.77 rows=40 width=84)
-> Subquery Scan *SELECT* 1
(cost=12055.58..12055.58 rows=20 width=84)
-> Limit (cost=12055.58..12055.58 rows=20
width=84)
-> Sort (cost=12055.58..12055.58
rows=23724 width=84)
-> Hash Join (cost=1.20..9674.89
rows=23724 width=84)
-> Seq Scan on a_trans_log a
(cost=0.00..8695.30 rows=24455 width=60)
-> Hash (cost=1.16..1.16
rows=16 width=24)
-> Seq Scan on
addtypelong tl (cost=0.00..1.16 rows=16 width=24)
-> Subquery Scan *SELECT* 2 (cost=617.19..617.19
rows=20 width=84)
-> Limit (cost=617.19..617.19 rows=20
width=84)
-> Sort (cost=617.19..617.19 rows=2462
width=84)
-> Hash Join (cost=1.20..478.50
rows=2462 width=84)
-> Seq Scan on b_trans_log b
(cost=0.00..378.61 rows=2462 width=60)
-> Hash (cost=1.16..1.16
rows=16 width=24)
-> Seq Scan on
addtypelong tl (cost=0.00..1.16 rows=16 width=24)

That cut the query down to 6 seconds. About 5 seconds longer than I would like
it. I'll take any performance increase, tho. I tried SET enable_seqscan=off;
And it takes 8 seconds. Strange indeed.

The reason I don't put the two tables together is that the tables don't have
the exact same structure... I suppose I could merge the two, but it would be
counterintuitive to logically apply some fields in table a to data collected
for table b. Plus, I do so many more operations on the single tables than the
joined tables that it wouldn't make good use of programming time to rewrite all
the single table ops.

I know... excuses, excuses! Thanks for the input. Any more ideas?

CG

--- Jean-Luc Lachance <jllachan(at)nsd(dot)ca> wrote:
> Chris,
>
> I believe you can cut down the processing time by including the order by
> and limit in each select:
>
> (SELECT ... ORDER BY 2 DESC, 4 LIMIT 20)
> union
> (SELECT ... ORDER BY 2 DESC, 4 LIMIT 20)
> ORDER BY 2 DESC, 4 LIMIT 20
>
> Give it a try and let me know
>
>
> JLL
>
> P.S.
>
> Question: Why not have a single table with a field log_type in ( 'A',
> 'B') ???
>
>
>
> Chris Gamache wrote:
> >
> > Three tables...
> > [...]
> >
> > I imagine the combination of UNION and ORDER BY causes the problem, since
> > Postgres has to locate all the rows that match the search criteria, merge
> them,
> > order them, then return the top 20...
> >
> > Any suggestions? Did I forget to provide any data that would make things
> > clearer?
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)

__________________________________________________
Do You Yahoo!?
Yahoo! Sports - sign up for Fantasy Baseball
http://sports.yahoo.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Charles 2002-03-04 14:25:38 backend closed
Previous Message Ioannis 2002-03-04 14:07:58 Re: postgre performance question