Re: Result set stability in queries sorted by column with non-unique values

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Michele Bosi <michele(dot)bosi(at)gmail(dot)com>
Cc: Niranjan <niranjan81(at)gmail(dot)com>, "pgsql-in-general(at)postgresql(dot)org" <pgsql-in-general(at)postgresql(dot)org>
Subject: Re: Result set stability in queries sorted by column with non-unique values
Date: 2015-04-16 05:23:29
Message-ID: CABOikdMA0KJOnCQadURPdF6ssk3JwP8Ez-HQqdEMjcRm-1ATBw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-in-general

On Wed, Apr 15, 2015 at 9:24 PM, Michele Bosi <michele(dot)bosi(at)gmail(dot)com>
wrote:

> Thanks Niranjan,
>
> In our case we require the rows returned to be in the same consistent
> order across several queries as we need to paginate through a large
> set of results by doing a combination of ORDER BY x, LIMIT, OFFSET.
> However the sorting key "x" can contain non-unique values, thus we
> need to be sure that rows with the same values end up in the same
> position across the different queries we do per each result "page".
>
> As I understand, unless we add a secondary sorting key (like the "id"
> of the row) which is guaranteed to be unique, we might end up with
> rows that can potentially "fluctuate" across the query result, is it
> correct?
>
>
There are two reasons why the ordering may differ between queries when
ORDER BY is on a non-unique column:

1. Different query plans being used for different queries
2. Physical ordering of rows has varied because of UPDATE/DELETES and
either VACUUM or HOT recycling dead space in the table.

If you control both these factors or somehow know that they are going to be
consistent across all queries, which pretty much means that all queries use
the same scan method for the table and the table itself is not undergoing
any DELETE/UPDATE, you could get the same row ordering. But clearly you
can't build your application under those assumptions and it will be far
better to add another column to the ORDER BY to guarantee same ordering.

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee

In response to

Responses

Browse pgsql-in-general by date

  From Date Subject
Next Message Michele Bosi 2015-04-16 10:12:29 Re: Result set stability in queries sorted by column with non-unique values
Previous Message Deepak 2015-04-16 05:04:18 Re: Result set stability in queries sorted by column with non-unique values