Re: JOIN issues (Left vs Right for sorting), and "Nested Loop" problem

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
Cc: "Alban Hertroys" <alban(at)magproductions(dot)nl>, "Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: JOIN issues (Left vs Right for sorting), and "Nested Loop" problem
Date: 2007-09-01 15:23:27
Message-ID: 16772.1188660207@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com> writes:
> On 01/09/07, Alban Hertroys <alban(at)magproductions(dot)nl> wrote:
>> Is 10 ms problematic for this query?

> I think you got 10ms from the query plan? These queries are very fast
> after they have been executed once. But the first time is huge.
> Sometimes I have to wait as much as 10 seconds (10,000ms?)

It's to be expected that repeating the same query would be faster, since
all the data will have been pulled from disk and be sitting in cache.
In this query you're fetching about 700 rows from random locations on
the disk, so if none of them are in memory already there's likely to be
700 seeks done. Seek times in the range of 10ms are not unusual for
cheap disks ... you do the math.

Solutions include buying faster disks, or buying more RAM so more of
your data can stay in cache. If your queries are very stylized (like
always using the same index) then you might get somewhere by CLUSTERing
on that index to reduce the number of seeks needed, but this is seldom
a solution that fixes everything.

>>> So my question: how can I do a left
>>> join, which is the logic that I wish to accomplish, but get the
>>> sorting to work from the second table and if a column is null then
>>> just return as 0 instead of nothing at all? (The LEFT JOIN used to
>>> work in Mysql).

>> You could use ORDER BY COALESCE(tradecount.u_count, 0) desc if you
>> want it to behave like you say mysql sorted it.

I got curious about this assertion and went to check it. AFAICT mysql
doesn't have any weird automatic coalesce involved in sorting. The
difference is that they sort nulls first, rather than last as we do:

mysql> select * from t1 left join t2 using(f1) order by t2.f2 ;
+----+------+
| f1 | f2 |
+----+------+
| 3 | NULL |
| 1 | 11 |
| 2 | 22 |
+----+------+
3 rows in set (0.00 sec)

mysql> select * from t1 left join t2 using(f1) order by t2.f2 desc;
+----+------+
| f1 | f2 |
+----+------+
| 2 | 22 |
| 1 | 11 |
| 3 | NULL |
+----+------+
3 rows in set (0.00 sec)

Same data in PG yields:

regression=# select * from t1 left join t2 using(f1) order by t2.f2 ;
f1 | f2
----+----
1 | 11
2 | 22
3 |
(3 rows)

regression=# select * from t1 left join t2 using(f1) order by t2.f2 desc;
f1 | f2
----+----
3 |
2 | 22
1 | 11
(3 rows)

Both behaviors are legal per spec (it's "implementation defined"
which is the ordering, according to the SQL standard).

As of PG 8.3 there will be NULLS FIRST and NULLS LAST options so that
you can get either ordering, but no released version has these:

regression=# select * from t1 left join t2 using(f1) order by t2.f2 nulls first;
f1 | f2
----+----
3 |
1 | 11
2 | 22
(3 rows)

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Richard Broersma Jr 2007-09-01 15:26:18 Re: Export data to MS Excel
Previous Message Alban Hertroys 2007-09-01 15:19:24 Re: JOIN issues (Left vs Right for sorting), and "Nested Loop" problem