Re: BUG #14136: select distinct from a materialized view does not preserve result order

From: Sean Knutson <seandknutson(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14136: select distinct from a materialized view does not preserve result order
Date: 2016-05-14 00:29:18
Message-ID: CAPhT6UjOTx-v2YORGc9gfKRrXcN=DwJCc6G1Vr3NPRGbziEb9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

> It is bad technique to assume an order from a query that is not
> explicitly specified in that query, regardless of whether the data
> is being drawn from a table, view, or materialized view. A
> DISTINCT or GROUP BY clause should not be assumed to order the
> output -- it might sometimes do so, if that is the fastest way to
> group things; but there is no requirement that it do so.

Thanks for the quick replies! Yeah, that makes sense. I think at first I
blindly assumed that it would preserve the order of the result set when
doing a DISTINCT, although it makes sense why it wouldn't (or at least that
you can't count on it). I think the bigger point of confusion for me (and
the reason I felt it worth reporting as a bug) was why the DISTINCT would
behave differently when selecting from a mat view or table vs from a view
or subquery, and it sounds like that's simply a result of which algorithm
the query planner chooses to evaluate the DISTINCT in each situation?

> I agree that it should be documented. I'll do something about that.

Thanks, I think that will help a great deal!

All the best,
-Sean

On Fri, May 13, 2016 at 7:33 PM, Kevin Grittner <kgrittn(at)gmail(dot)com> wrote:

> On Fri, May 13, 2016 at 5:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > seandknutson(at)gmail(dot)com writes:
>
> >> and a materialized view defined as
> >> create materialized view view_test as (select * from test order by
> name);
> >
> > AFAIK, a matview is just a table, and even if you define it like that
> > there's no guarantee that the rows in the matview will be physically
> > ordered by name.
>
> > This seems like a documentation shortcoming, though, as the documentation
> > doesn't really say that a matview won't preserve row order (or at least
> > I see nothing about it in the CREATE MATERIALIZED VIEW ref page --- maybe
> > it's mentioned somewhere else?)
>
> I agree that it should be documented. I'll do something about that.
>
> Note that even if the table is initially created in order, there is
> no guarantee that REFRESH CONCURRENTLY will leave it in order. (In
> that regard it's more-or-less like CLUSTER.) And even if you query
> it while it is in order, there is no guarantee that this is the
> order in which the rows would be returned, as there could be a
> synchronous scan or an index-only scan on an index in some other
> order.
>
> >> If I have a non-materialized view with the same definition,
> >> the query returns
> >>
> >> name
> >> -------
> >> adam
> >> jim
> >> sean
> >> steve
> >>
> >> as expected.
>
> It is bad technique to assume an order from a query that is not
> explicitly specified in that query, regardless of whether the data
> is being drawn from a table, view, or materialized view. A
> DISTINCT or GROUP BY clause should not be assumed to order the
> output -- it might sometimes do so, if that is the fastest way to
> group things; but there is no requirement that it do so.
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Christophe Pettus 2016-05-14 02:44:26 pg_get_viewdef() drops casts, causing broken definitions
Previous Message Kevin Grittner 2016-05-13 23:33:30 Re: BUG #14136: select distinct from a materialized view does not preserve result order