Re: [NEWBIE] need help optimizing this query

From: Dexter Tad-y <dexterbt1(at)my(dot)smart(dot)com(dot)ph>
To: PostgreSQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: [NEWBIE] need help optimizing this query
Date: 2004-03-10 15:34:52
Message-ID: 1078932892.2833.53.camel@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, 2004-03-10 at 23:08, Stephan Szabo wrote:
> On Wed, 10 Mar 2004, Dexter Tad-y wrote:
>
> > Greetings,
> > I need help in optimizing this query:
> >
> > select a.id, b.transaction from test as a left join pg_locks as b on
> > a.xmax = b.transaction where b.transaction is null;
> >
> > im using the query in obtaining records not locked by other
> > transactions. any help is appreciated.
>
> It's hard to say without knowing more about the size of a and explain
> analyze output. On my 7.4 machine, using NOT IN rather than the left join
> gives about a 2x speed increase on a 400k row table.

here's what comes up with explain:

1) using LEFT JOIN

csp=> explain select a.id, b.transaction from test as a left join
pg_locks as b on a.xmax = b.transaction having transaction is null;
QUERY PLAN
-------------------------------------------------------------------------------------
Hash Left Join (cost=15.00..340.01 rows=1000 width=12)
Hash Cond: ("outer".xmax = "inner"."transaction")
Filter: ("inner"."transaction" IS NULL)
-> Seq Scan on test a (cost=0.00..20.00 rows=1000 width=12)
-> Hash (cost=12.50..12.50 rows=1000 width=4)
-> Function Scan on pg_lock_status l (cost=0.00..12.50
rows=1000 width=4)
(6 rows)

2) using NOT IN

csp=> explain select * from test where id not in (select test.id from
test, pg_locks where pg_locks.transaction=test.xmax);
QUERY PLAN
---------------------------------------------------------------------------------------------
Seq Scan on test (cost=352.51..375.01 rows=500 width=32)
Filter: (NOT (hashed subplan))
SubPlan
-> Hash Join (cost=15.00..340.01 rows=5001 width=8)
Hash Cond: ("outer".xmax = "inner"."transaction")
-> Seq Scan on test (cost=0.00..20.00 rows=1000 width=12)
-> Hash (cost=12.50..12.50 rows=1000 width=4)
-> Function Scan on pg_lock_status l
(cost=0.00..12.50 rows=1000 width=4)
(8 rows)

which of the two is faster? :D

cheers!

Dexter Tad-y

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dexter Tad-y 2004-03-10 15:51:45 Re: [NEWBIE] need help optimizing this query
Previous Message wespvp 2004-03-10 15:33:42 More Deadlock Detection on Insert