Re: Result set stability in queries sorted by column with non-unique values

From: Michele Bosi <michele(dot)bosi(at)gmail(dot)com>
To: "Podrigal, Aron" <aronp(at)guaranteedplus(dot)com>, pgsql-in-general(at)postgresql(dot)org
Subject: Re: Result set stability in queries sorted by column with non-unique values
Date: 2015-04-15 17:31:40
Message-ID: CACmQ0_J6f+muxkBfeyWyrFJBHyiWECmY=QPiA3-nmsx=_oR1tg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-in-general

That's what I heard but I could not find any convincing reference in
the docs or elsewhere, do you know of any?

On 15 April 2015 at 17:54, Podrigal, Aron <aronp(at)guaranteedplus(dot)com> wrote:
> I think the non-unique cols are usually returned in the order they're
> stored on disk.
>
>
> Aron
>
> On 4/15/15, Michele Bosi <michele(dot)bosi(at)gmail(dot)com> wrote:
>> Thanks Niranjan,
>>
>> In our case we require the rows returned to be in the same consistent
>> order across several queries as we need to paginate through a large
>> set of results by doing a combination of ORDER BY x, LIMIT, OFFSET.
>> However the sorting key "x" can contain non-unique values, thus we
>> need to be sure that rows with the same values end up in the same
>> position across the different queries we do per each result "page".
>>
>> As I understand, unless we add a secondary sorting key (like the "id"
>> of the row) which is guaranteed to be unique, we might end up with
>> rows that can potentially "fluctuate" across the query result, is it
>> correct?
>>
>> Thank you
>>
>> On 15 April 2015 at 15:54, Niranjan <niranjan81(at)gmail(dot)com> wrote:
>>> Hi,
>>> I am sure the variation depends on the query plan decided by the engine
>>> at
>>> the time of execution, in short, AFAIK (as far as I know) the answer is
>>> No.
>>> There is no guarantee that the rows will be returned in the exact order.
>>>
>>> Some of the possible answers to be further explored would be
>>> A) if we want the data to be always randomly ordered explore function
>>> random(), or is it rand() I guess? to be included in the query.
>>> B) if we want the data to be arbitrarily ordered but maintain the order
>>> for
>>> each query execution, save a randomly ordered same number of records in a
>>> different table and then write a joined query.
>>>
>>> I hope it helps.
>>>
>>> Thanks & Regards
>>> Niranjan D. Pandit
>>>
>>> On 15-Apr-2015 5:11 pm, "Michele Bosi" <michele(dot)bosi(at)gmail(dot)com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I would like to know if postgresql is guaranteed to return the rows
>>>> always in the same order in the following case:
>>>>
>>>> table "boxes"
>>>> -------------------
>>>> width | height
>>>> 1 | 5
>>>> 1 | 3
>>>> 1 | 7
>>>> 1 | 2
>>>> 1 | 4
>>>>
>>>> SELECT * FROM boxes ORDER BY width;
>>>>
>>>> result:
>>>> 1 | 5
>>>> 1 | 3
>>>> 1 | 7
>>>> 1 | 2
>>>> 1 | 4
>>>>
>>>> In practice we've seen that postgresql returns the rows in some kind
>>>> of stably arbitrary order (which is expected because "width" contains
>>>> non unique values) but the subtle ambiguity here (which is important
>>>> for our application) is whether postresql is guaranteed to always
>>>> return the same arbitrary order in a stable fashion or if it can (even
>>>> theoretically) return an always different aribtrary order every time I
>>>> perform the query, say for example:
>>>>
>>>> SELECT * FROM boxes ORDER BY width;
>>>> 1 | 3
>>>> 1 | 5
>>>> 1 | 7
>>>> 1 | 4
>>>> 1 | 2
>>>>
>>>> SELECT * FROM boxes ORDER BY width;
>>>> 1 | 2
>>>> 1 | 5
>>>> 1 | 7
>>>> 1 | 4
>>>> 1 | 3
>>>>
>>>> etc.
>>>>
>>>> The only two relevant lines I've found in the docs
>>>> (http://www.postgresql.org/docs/current/interactive/queries-order.html)
>>>> are:
>>>>
>>>> "A particular output ordering can only be guaranteed if the sort step
>>>> is explicitly chosen." which does not tell us in what order rows with
>>>> same sorting-key value are returned.
>>>>
>>>> and
>>>>
>>>> "If sorting is not chosen, the rows will be returned in an unspecified
>>>> order." which I would expect to be the case also when sorting is
>>>> chosen BUT the sorting-key values are always the same. However this
>>>> still does not tell us if the "unspecified order" is always the same
>>>> or not, ie. if it's stable. Another way to put this last point would
>>>> be: is "SELECT * FROM boxes LIMIT 3" guaranteed to always return the
>>>> same 3 rows (assuming I don't insert or delete new rows) or can
>>>> theoretically return any different 3 every time I query?
>>>>
>>>> Thank you.
>>>>
>>>>
>>>> --
>>>> Sent via pgsql-in-general mailing list (pgsql-in-general(at)postgresql(dot)org)
>>>> To make changes to your subscription:
>>>> http://www.postgresql.org/mailpref/pgsql-in-general
>>
>>
>> --
>> Sent via pgsql-in-general mailing list (pgsql-in-general(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-in-general
>>

In response to

Responses

Browse pgsql-in-general by date

  From Date Subject
Next Message Podrigal, Aron 2015-04-16 01:34:47 Re: Result set stability in queries sorted by column with non-unique values
Previous Message Michele Bosi 2015-04-15 15:54:11 Re: Result set stability in queries sorted by column with non-unique values