From: | Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com> |
---|---|
To: | 066ce286(at)free(dot)fr |
Cc: | Mark Pasterkamp <markpasterkamp1994(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: UNION ALL |
Date: | 2019-08-15 19:21:17 |
Message-ID: | CALtqXTfMFEte-JDcAJPsYW1TU2CJdECNM5mHHtgn4C9fEtwEpw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Aug 16, 2019 at 12:16 AM <066ce286(at)free(dot)fr> wrote:
> Generally speaking, when executing UNION ; a DISTINCT is run afterward on
> the resultset.
>
> So, if you're sure that each part of UNION cannot return a line returned
> by another one, you may use UNION ALL, you'll cut the cost of the final
> implicit DISTINCT.
>
>
> ----- Mail original -----
> De: "Mark Pasterkamp" <markpasterkamp1994(at)gmail(dot)com>
> À: pgsql-hackers(at)lists(dot)postgresql(dot)org
> Envoyé: Jeudi 15 Août 2019 20:37:06
> Objet: UNION ALL
>
>
> Dear all,
>
>
> I was wondering if someone could help me understands what a union all
> actually does.
>
>
> For my thesis I am using Apache Calcite to rewrite queries into using
> materialized views which I then give to a Postgres database.
> For some queries, this means that they will be rewritten in a UNION ALL
> style query between an expression and a table scan of a materialized view.
> However, contrary to what I expected, the UNION ALL query is actually a
> lot slower.
>
>
> As an example, say I have 2 tables: actor and movie. Furthermore, there is
> also a foreign key index on movie to actor.
> I also have a materialized view with the join of these 2 tables for all
> movies <= 2015 called A.
> Now, if I want to query all entries in the join between actor and movie, I
> would assume that a UNION ALL between the join of actor and movie for
> movies >2015 and A is faster than executing the original query..
> If I look at the explain analyze part, I can certainly see a reduction in
> cost up until the UNION ALL part, which carries a respective cost more than
> negating the cost reduction up to a point where I might as well not use the
> existing materialized view.
>
>
> I have some trouble understanding this phenomenon.
> One thought which came to my mind was that perhaps UNION ALL might create
> a temporary table containing both result sets, and then do a table scan and
> return that result.
>
> this would greatly increase IO cost which could attribute to the problem.
> However, I am really not sure what UNION ALL actually does to append both
> result sets so I was wondering if someone would be able to help me out with
> this.
>
>
>
>
> Mark
>
>
> 066ce286(at)free(dot)fr: Please, avoid top-posting. It makes harder to follow
the
discussion.
--
Ibrar Ahmed
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2019-08-15 20:53:51 | Re: Extension development |
Previous Message | 066ce286 | 2019-08-15 19:15:50 | Re: UNION ALL |