Quick Links

Join question

From:	tyrrill_ed(at)emc(dot)com
To:	<pgsql-sql(at)postgresql(dot)org>
Subject:	Join question
Date:	2007-08-21 17:48:18
Message-ID:	5C7C0B0734F87445AFC8B63EDCC4901E015ED7EE@CORPUSMX60C.corp.emc.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-sql

Hey All,

I have a query I'm trying to speed up, and I was hoping someone could
help me. I have a three tables a and b hold data, and c just references
between a and b:

create table a (
a_id int,
x int
);

create table b (
b_id int,
x int
);

create table c (
a_id int,
b_id int
);

I am doing a query like this:

SELECT a.x, max(b.x) FROM a, b, c WHERE a.a_id = c.a_id AND b.b_id =
c.b_id GROUP by a.x;

I only need to get one row from b for each row in a, and it really
doesn't matter which one. I use max() to get a single value from table
b. There are generally be dozens to hundreds of rows in b for each row
in a. The problem is when I have a query with tens of thousands of rows
in a that the join with b will have millions of rows, and is really
slow. The group by effectively reduces the results down to what I want,
but it still has to process the millions of rows. Does anyone know a
way I could restructure this query to get only one b for each a in a
faster way?

Thanks,
Ed Tyrrill

Responses

Re: Join question at 2007-08-21 19:16:40 from Richard Broersma Jr
Re: Join question at 2007-08-21 21:16:35 from Michael Glaesemann

Browse pgsql-sql by date

	From	Date	Subject
Next Message	Richard Broersma Jr	2007-08-21 19:16:40	Re: Join question
Previous Message	Richard Huxton	2007-08-21 13:23:12	Re: wrong answer