Re: optimising UNION performance

From: Rafal Pietrak <rafal(at)zorro(dot)isa-geek(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: optimising UNION performance
Date: 2006-08-28 12:15:22
Message-ID: 1156767322.6725.37.camel@zorro.isa-geek.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, 2006-08-28 at 13:04 +0200, Alban Hertroys wrote:
> Rafal Pietrak wrote:
> > Total runtime: 822.901 ms
> > (7 rows)
> > -----------------------------------------
>
> Just to make sure: You do have an appropriate index over the tables in
> that UNION?

Well. The logfiles don't have their own indexes but make foreign key
references over brand1/brand2/clty columns. Unique constreins are on the
target tables.

But that particual ANALYSED query was: "SELECT 1 FROM comlog".

So, the use of seq-scan looks quite adequate to me (I haven't posted
results to avoid cuttering of my original query with too many details,
but the ANALYSE of "SELECT * FROM comlog" gives almost exactly the same
cost and time, and *that* is what I will actually be doing in the
application).

So: "SELECT 1" on comlog costs 830ms and is done by two saq-scans.

but: "SELECT 1" on log1 gives 120ms, and "SELECT 1" on log2 gives 80ms.

All three queries are executed as seq-scans.... which look OK, as I
intend to fetch *all* the rows.

And yet, there is this 600ms 'leak'.

Now, as I try to read the ANALYSE output (which I'm not very proficient
at - just blindly comparing them): for queries of raw logs (log1/log2),
I can see just one server task: "seq-scan" for each respective query.
But when I look at ANALYSE output of comlog SELECT, I can see, that:
1. the seq-scans is more expensive here: 170ms and 120ms respectively.
Any reasons for that?
2. each scan has an additional job of: Subquery Scan "*SELECT* 1" ...
which costs even more (280ms and 230ms respectively), although it's
purpose it not very clear to me.
3. only on top of that, there is an Append process, which looks cheap as
expected.

So I can risk an opinion, that I can seek the missing 600ms, I only
don't understand why it's there.

Or to put it the other way around: is there a way to write a UNION where
the 200ms to 800ms cost increase does not occure.

-R

> From experience, it seems that PostgreSQL chooses a sequential scan
> over unioned sets instead of an index scan - the details escape me, but
> there is a good reason for that. I'm sure it's not for performance
> reasons, though.
>
> There have been some discussions about inheritance performance, which
> boils down to exactly this problem (inheritance basically is a UNION
> over all the tables involved). You may want to check the archives.
>
> Regards,
--
-R

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bobby Gontarski 2006-08-28 12:38:07 pg_restore problems
Previous Message Alban Hertroys 2006-08-28 11:04:45 Re: optimising UNION performance