Re: Any better plan for this query?..

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Dimitri <dimitrik(dot)fr(at)gmail(dot)com>
Cc: PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Any better plan for this query?..
Date: 2009-05-06 08:01:03
Message-ID: 4A0143BF.4020103@postnewspapers.com.au
Lists: pgsql-performance

Dimitri wrote:
> Hi,
>
> any idea if there is a more optimal execution plan possible for this query:
>
> select S.REF as stref, S.NAME as stnm, H.HORDER as hord, H.BEGIN_DATE as hbeg,
> H.END_DATE as hend, H.NOTE as hnote
> from HISTORY H, STAT S
> where S.REF = H.REF_STAT
> and H.REF_OBJECT = '0000000001'
> order by H.HORDER ;

OK, so you're taking a simple:

history INNER JOIN stat ON (stat.ref = history.ref_stat)

then filtering for records with a particular value of history.ref_object
and finally performing a sort.

If I'm reading it right, the plan below does a sequential scan on the
`stat' table. The stat table only has 1000 rows, so this isn't
necessarily an unreasonable choice even if there is an appropriate index
and even if not many of the rows will be needed.

It then does an index scan of the history table looking for tuples with
ref_object = '0000000001' (text match). It hash joins the hashed results
of the initial seq scan to the results of the index scan, and sorts the
result.
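If you haven't already, it's worth checking the planner's estimates against what actually happens at run time. Something like this (the query text is lifted from your post; note that EXPLAIN ANALYZE really executes the query):

```sql
-- Show the chosen plan with actual row counts and timings,
-- so estimate/actual mismatches stand out.
EXPLAIN ANALYZE
SELECT S.REF AS stref, S.NAME AS stnm, H.HORDER AS hord,
       H.BEGIN_DATE AS hbeg, H.END_DATE AS hend, H.NOTE AS hnote
FROM HISTORY H
JOIN STAT S ON S.REF = H.REF_STAT
WHERE H.REF_OBJECT = '0000000001'
ORDER BY H.HORDER;
```

If the estimated and actual row counts differ wildly, running ANALYZE on the tables is the first thing to try.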

To me, that looks pretty reasonable. You might be able to avoid the hash
join in favour of a nested loop scan of stat_ref_idx (looping over
records from history.ref_stat where ref_object = '0000000001') by
providing a composite index on HISTORY(ref_stat, ref_object). I'm really
not too sure, though; plan optimization isn't my thing, I'm just seeing
if I can offer a few ideas.
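If you want to experiment with that, something along these lines should do it (the index name is just an example; run ANALYZE afterwards so the planner has fresh statistics to work with):

```sql
-- Composite index covering both the join column and the filter column.
CREATE INDEX history_refstat_refobject_idx
    ON history (ref_stat, ref_object);
ANALYZE history;
```

Then re-run EXPLAIN on the query and see whether the planner picks it up.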

> Table definitions:

While not strictly necessary, it's a *REALLY* good idea to define a
suitable PRIMARY KEY.
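Guessing at the key columns from the join condition (adjust to whatever is actually unique in your data), that'd be something like:

```sql
-- REF looks like the natural key of STAT, judging by the join.
ALTER TABLE stat ADD PRIMARY KEY (ref);

-- Only if this pair really is unique per history row:
ALTER TABLE history ADD PRIMARY KEY (ref_object, horder);
```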

Also, the `CHAR(n)' data type is evil. E.V.I.L. Use `varchar(n)' for
bounded-length values, or `text' for unbounded fields, unless you REALLY
want the crazy behaviour of `CHAR(n)'.
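For example, `char(n)' silently pads values with spaces to the declared length, then throws the padding away again whenever the value is compared or cast, which regularly surprises people:

```sql
-- char(n) pads to the declared length, but comparisons and
-- casts to other string types discard the trailing spaces:
SELECT 'abc'::char(10) = 'abc'::varchar;  -- true, padding ignored
SELECT 'abc'::char(10) || '<';            -- 'abc<', padding stripped on cast
```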

I'm a little bit puzzled about why you seem to be doing lots of things
with integer values stored in text strings, but that probably doesn't
matter too much for the issue at hand.

> NOTE: The same query runs 2 times faster on MySQL.

With InnoDB tables and proper transactional safety? Or using scary
MyISAM tables and a "just pray" approach to data integrity? If you're
using MyISAM tables I'm not surprised; MySQL with MyISAM is stunningly
fast, but oh-my-god dangerous.

--
Craig Ringer
