| From: | tyrrill_ed(at)emc(dot)com | 
|---|---|
| To: | <pgsql-sql(at)postgresql(dot)org> | 
| Subject: | Join question | 
| Date: | 2007-08-21 17:48:18 | 
| Message-ID: | 5C7C0B0734F87445AFC8B63EDCC4901E015ED7EE@CORPUSMX60C.corp.emc.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-sql | 
Hey All,
I have a query I'm trying to speed up, and I was hoping someone could
help me.  I have a three tables a and b hold data, and c just references
between a and b:
create table a (
   a_id int,
   x int
);
create table b (
   b_id int,
   x int
);
create table c (
   a_id int,
   b_id int
);
I am doing a query like this:
SELECT a.x, max(b.x) FROM a, b, c WHERE a.a_id = c.a_id AND b.b_id =
c.b_id GROUP by a.x;
I only need to get one row from b for each row in a, and it really
doesn't matter which one.  I use max() to get a single value from table
b.  There are generally be dozens to hundreds of rows in b for each row
in a.  The problem is when I have a query with tens of thousands of rows
in a that the join with b will have millions of rows, and is really
slow.  The group by effectively reduces the results down to what I want,
but it still has to process the millions of rows.  Does anyone know a
way I could restructure this query to get only one b for each a in a
faster way?
Thanks,
Ed Tyrrill
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Richard Broersma Jr | 2007-08-21 19:16:40 | Re: Join question | 
| Previous Message | Richard Huxton | 2007-08-21 13:23:12 | Re: wrong answer |