AW: AW: Optimizer Hint, to ignore limit and offset in optimizer plan

From: Martin Handsteiner <martin(dot)handsteiner(at)sibvisions(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, "pgsql-sql(at)lists(dot)postgresql(dot)org" <pgsql-sql(at)lists(dot)postgresql(dot)org>
Subject: AW: AW: Optimizer Hint, to ignore limit and offset in optimizer plan
Date: 2020-06-29 17:33:10
Message-ID: AM4PR1001MB1412D79D3E6A863EEE913960E86E0@AM4PR1001MB1412.EURPRD10.PROD.OUTLOOK.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

> Not very good, unless you can present a far more convincing use-case.

The use-case is saving memory and increase speed for showing data in applications.
This sounds simple, but please have a look at the scenarios and problems.

I'm aware, that an optimizer hint "enable_costlimit = true (default) | false" will not guarantee 100% transaction save data, of course.

This is also not the case by using an order by, there is still the same problem.
(for eg, forums are paged with limit, and ordered by creation date desc of the post. Switching to the second page (limit 10 offset 10) can show a post again, that was already on the first page, when there is a new post in the meanwhile.
But mainly every user can live with the result, as long as there are not completely nondeterministic results on paging)

In complex applications, there have to be shown a lot of data, sometimes really complex views.

Some Use Cases we have:
- A complex, already tuned view that delivers the first rows of data in a view seconds. The user can page and view the next data, the paging will get slower, every page.
On a point, he has to define better search parameters, or the user can start a job that creates a full report without using the limit, but it takes time.
Very often the users are satisfied with the first pages. (They can work with the first pages of the result. On next day, due to their work they get again first pages on which they can work on)
Any order by will cause the view to take minutes, even if the first page will only show 10 rows, because the whole query has to be finished by the db and then ordered. This would not be practicable for any application, that the user has to wait minutes after a click...

- A view that delivers millions of result rows. The application has an sort on header functionality, where the user can sort on any column.
So there is no chance of setting a order by that will ensure a proper page order (eg sort on groupname).

If limit and offset would not be taken into account by optimizer, the above use cases would work acceptable. For transaction save problems, there has to always created a solution without limit and offset, of course.
But exact the use cases where the limit and offset is needed (reducing memory and increase speed), the limit and offset clause causes non deterministic results for paging.

> I'd counsel taking another look at the suggestion made upthread to use a cursor WITH HOLD.

I will try, but I have found so far searching the internet, that the postgres jdbc driver does not support holdable cursors.
I have to check it by my own.
Anyway, I was already told, that this belongs to the jdbc mailing list, and seams to be for now not an option as solution, if it is not supported by jdbc.

Regards
Martin Handsteiner

-----Ursprüngliche Nachricht-----
Von: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Gesendet: Montag, 29. Juni 2020 16:30
An: Martin Handsteiner <martin(dot)handsteiner(at)sibvisions(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>; pgsql-sql(at)lists(dot)postgresql(dot)org
Betreff: Re: AW: Optimizer Hint, to ignore limit and offset in optimizer plan

Martin Handsteiner <martin(dot)handsteiner(at)sibvisions(dot)com> writes:
> yes, this would exactly be the feature I was talking about.
> enable_costlimit = true (default) | false How high is the possibility,
> that this feature will find the way into the postgres db?

Not very good, unless you can present a far more convincing use-case.

If I understand your desire, it is that issuing the "same" query in different transactions would generate identical overall results despite varying the offset/limit so as to fetch different parts of that unchanging result.

The problem with this is that whether changing the offset/limit changes the plan shape is just one of many reasons why you might not get a consistent result --- the most unavoidable being that other transactions might commit data changes.

Moreover, you insist that you shouldn't have to use an ORDER BY to get these consistent results. Sorry, but SQL is *defined* to not produce consistent row ordering without ORDER BY. Changing that isn't a matter of some optimizer hint somewhere, it's a very fundamental thing in many places.

I'd counsel taking another look at the suggestion made upthread to use a cursor WITH HOLD.

regards, tom lane

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Mike Martin 2020-07-08 10:31:55 Partition by list - is select possible?
Previous Message Bruce Momjian 2020-06-29 15:48:01 Re: AW: Optimizer Hint, to ignore limit and offset in optimizer plan