Re: Overhead of union versus union all

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, Scott Bailey <artacus(at)comcast(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: Overhead of union versus union all
Date: 2009-07-10 13:40:47
Message-ID: 1247233247.11347.587.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On Fri, 2009-07-10 at 09:28 -0400, Bruce Momjian wrote:
> Simon Riggs wrote:
> >
> > On Fri, 2009-07-10 at 08:59 -0400, Bruce Momjian wrote:
> >
> > > > I think it should be possible to use predtest theorem proving to
> > > discard
> > > > the sort/hash step in cases where we can prove the sets are
> > > disjoint.
> > > > Often there are top-level quals that can be compared in the WHERE
> > > > clauses of the sub-queries, so a shallow search could be quite
> > > > profitable in allowing us to rewrite a UNION into a UNION ALL.
> > >
> > > I assume we would still need the distinct removal step; we just avoid
> > > the sort/hash.
> >
> > I mean it seems possible to prove that the distinct removal step is not
> > necessary, by proving that the various sub-queries are already disjoint.
> > It's a common manual optimization, so automating it seems a reasonable
> > future goal.
>
> I am confused what sub-queries produce _distinct_ output. I know there
> are some that produce _ordered_ output.

None, that was not my point.

If you have a query like this

Select ..., status, ...
...
where status = '1'
union
Select ..., status, ...
...
where status = '2';

or a query like this

Select '1', ...
...
union
Select '2', ...
...
;

or a query like this

Select '1', ...
...
union
Select status, ...
...
where status != '1';
;

then it is clear that we could automatically prove that the the distinct
step is redundant and so we could either hash or sort. This is the same
as replacing the UNION with UNION ALL.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 2009-07-10 13:46:23 Re: Overhead of union versus union all
Previous Message nabble.30.miller_2555 2009-07-10 13:35:48 Re: Database storage