Re: Result set stability in queries sorted by column with non-unique values

From: "Podrigal, Aron" <aronp(at)guaranteedplus(dot)com>
To: Michele Bosi <michele(dot)bosi(at)gmail(dot)com>
Cc: pgsql-in-general(at)postgresql(dot)org
Subject: Re: Result set stability in queries sorted by column with non-unique values
Date: 2015-04-16 01:34:47
Message-ID: CANJp-yjxbei8iwAMXJsp04Q2X_1iGzpapkLxnu3X7dYsrQ0DwQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-in-general

I think this stack answer would give you some more info
http://dba.stackexchange.com/a/38718
On Apr 15, 2015 1:32 PM, "Michele Bosi" <michele(dot)bosi(at)gmail(dot)com> wrote:

> That's what I heard but I could not find any convincing reference in
> the docs or elsewhere, do you know of any?
>
> On 15 April 2015 at 17:54, Podrigal, Aron <aronp(at)guaranteedplus(dot)com>
> wrote:
> > I think the non-unique cols are usually returned in the order they're
> > stored on disk.
> >
> >
> > Aron
> >
> > On 4/15/15, Michele Bosi <michele(dot)bosi(at)gmail(dot)com> wrote:
> >> Thanks Niranjan,
> >>
> >> In our case we require the rows returned to be in the same consistent
> >> order across several queries as we need to paginate through a large
> >> set of results by doing a combination of ORDER BY x, LIMIT, OFFSET.
> >> However the sorting key "x" can contain non-unique values, thus we
> >> need to be sure that rows with the same values end up in the same
> >> position across the different queries we do per each result "page".
> >>
> >> As I understand, unless we add a secondary sorting key (like the "id"
> >> of the row) which is guaranteed to be unique, we might end up with
> >> rows that can potentially "fluctuate" across the query result, is it
> >> correct?
> >>
> >> Thank you
> >>
> >> On 15 April 2015 at 15:54, Niranjan <niranjan81(at)gmail(dot)com> wrote:
> >>> Hi,
> >>> I am sure the variation depends on the query plan decided by the engine
> >>> at
> >>> the time of execution, in short, AFAIK (as far as I know) the answer is
> >>> No.
> >>> There is no guarantee that the rows will be returned in the exact
> order.
> >>>
> >>> Some of the possible answers to be further explored would be
> >>> A) if we want the data to be always randomly ordered explore function
> >>> random(), or is it rand() I guess? to be included in the query.
> >>> B) if we want the data to be arbitrarily ordered but maintain the order
> >>> for
> >>> each query execution, save a randomly ordered same number of records
> in a
> >>> different table and then write a joined query.
> >>>
> >>> I hope it helps.
> >>>
> >>> Thanks & Regards
> >>> Niranjan D. Pandit
> >>>
> >>> On 15-Apr-2015 5:11 pm, "Michele Bosi" <michele(dot)bosi(at)gmail(dot)com> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> I would like to know if postgresql is guaranteed to return the rows
> >>>> always in the same order in the following case:
> >>>>
> >>>> table "boxes"
> >>>> -------------------
> >>>> width | height
> >>>> 1 | 5
> >>>> 1 | 3
> >>>> 1 | 7
> >>>> 1 | 2
> >>>> 1 | 4
> >>>>
> >>>> SELECT * FROM boxes ORDER BY width;
> >>>>
> >>>> result:
> >>>> 1 | 5
> >>>> 1 | 3
> >>>> 1 | 7
> >>>> 1 | 2
> >>>> 1 | 4
> >>>>
> >>>> In practice we've seen that postgresql returns the rows in some kind
> >>>> of stably arbitrary order (which is expected because "width" contains
> >>>> non unique values) but the subtle ambiguity here (which is important
> >>>> for our application) is whether postresql is guaranteed to always
> >>>> return the same arbitrary order in a stable fashion or if it can (even
> >>>> theoretically) return an always different aribtrary order every time I
> >>>> perform the query, say for example:
> >>>>
> >>>> SELECT * FROM boxes ORDER BY width;
> >>>> 1 | 3
> >>>> 1 | 5
> >>>> 1 | 7
> >>>> 1 | 4
> >>>> 1 | 2
> >>>>
> >>>> SELECT * FROM boxes ORDER BY width;
> >>>> 1 | 2
> >>>> 1 | 5
> >>>> 1 | 7
> >>>> 1 | 4
> >>>> 1 | 3
> >>>>
> >>>> etc.
> >>>>
> >>>> The only two relevant lines I've found in the docs
> >>>> (
> http://www.postgresql.org/docs/current/interactive/queries-order.html)
> >>>> are:
> >>>>
> >>>> "A particular output ordering can only be guaranteed if the sort step
> >>>> is explicitly chosen." which does not tell us in what order rows with
> >>>> same sorting-key value are returned.
> >>>>
> >>>> and
> >>>>
> >>>> "If sorting is not chosen, the rows will be returned in an unspecified
> >>>> order." which I would expect to be the case also when sorting is
> >>>> chosen BUT the sorting-key values are always the same. However this
> >>>> still does not tell us if the "unspecified order" is always the same
> >>>> or not, ie. if it's stable. Another way to put this last point would
> >>>> be: is "SELECT * FROM boxes LIMIT 3" guaranteed to always return the
> >>>> same 3 rows (assuming I don't insert or delete new rows) or can
> >>>> theoretically return any different 3 every time I query?
> >>>>
> >>>> Thank you.
> >>>>
> >>>>
> >>>> --
> >>>> Sent via pgsql-in-general mailing list (
> pgsql-in-general(at)postgresql(dot)org)
> >>>> To make changes to your subscription:
> >>>> http://www.postgresql.org/mailpref/pgsql-in-general
> >>
> >>
> >> --
> >> Sent via pgsql-in-general mailing list (pgsql-in-general(at)postgresql(dot)org
> )
> >> To make changes to your subscription:
> >> http://www.postgresql.org/mailpref/pgsql-in-general
> >>
>

In response to

Responses

Browse pgsql-in-general by date

  From Date Subject
Next Message Niranjan 2015-04-16 02:02:00 Re: Result set stability in queries sorted by column with non-unique values
Previous Message Michele Bosi 2015-04-15 17:31:40 Re: Result set stability in queries sorted by column with non-unique values