Re: Why doesn't Postgres apply limit on groups when retrieving N results per group using WHERE IN + ORDER BY

From: Olivier Gautherot <ogautherot(at)gautherot(dot)net>
To: Sean v <sean(at)vanmulligen(dot)ca>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Ron Johnson <ronljohnsonjr(at)gmail(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Why doesn't Postgres apply limit on groups when retrieving N results per group using WHERE IN + ORDER BY
Date: 2024-02-07 07:20:01
Message-ID: CAJ7S9TV2=_wBe1k4Pn8Pmadw7WBD-SE9u5+iwT3OsXCRVri1hg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

El mié, 7 feb 2024 8:07, Sean v <sean(at)vanmulligen(dot)ca> escribió:

> Exactly. I'm really just trying to understand if there's some functional
> limitation to it being able to do that with how it executes these types of
> queries, or if its just an optimization that hasn't been built into the
> query planner yet.
>
> I know I can get it to do precisely this if I use a CROSS JOIN LATERAL:
>
> SELECT o.*FROM company_users cuCROSS JOIN LATERAL (
> SELECT *
> FROM orders o
> WHERE o.user_id = company_users.user_id
> ORDER BY created_at DESC LIMIT 50
> ) cuWHERE cu.company_id = ? ORDER BY created_at DESC LIMIT 50
>
> That makes sense to me, it forces a nested loop and executes for each
> user. But doing a nested select like the query below doesn't use the index
> or limit the results to 50 per user - even though it does a nested loop
> just like the lateral join does:
>
> SELECT "orders".* FROM "orders" WHERE user_id IN (SELECT user_id FROM company_users WHERE company_id = ?)ORDER BY "orders"."created_at" LIMIT 50
>
>
Joins will generally query the whole tables, leading to long run times.
Have you tried to preselect the rows of interest with a "WITH ... SELECT
..." query to reduce the amount of data processed?

On 2024-02-05 7:58 a.m., David G. Johnston wrote:
>
> On Mon, Feb 5, 2024 at 8:55 AM Ron Johnson <ronljohnsonjr(at)gmail(dot)com>
> wrote:
>
>>
>> Who knows which users are going to be in that list???
>>
>>
> It doesn't matter. Worse case scenario there is only one user in the
> result and so all 50 rows are their earliest 50 rows. The system will thus
> never need more than the earliest 50 rows per user to answer this question.
>
> David J.
>
>
Cheers
Olivier

>
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Carsten Klein 2024-02-07 10:28:25 Generic File Access Function to read program output
Previous Message Sean v 2024-02-07 07:06:58 Re: Why doesn't Postgres apply limit on groups when retrieving N results per group using WHERE IN + ORDER BY