Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions

From: Dimitrios Apostolou <jimis(at)gmx(dot)net>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions
Date: 2024-05-13 13:52:42
Message-ID: 98c715d1-22f4-0fc1-1997-6236873c13de@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, 14 May 2024, David Rowley wrote:

> On Tue, 14 May 2024 at 00:41, Dimitrios Apostolou <jimis(at)gmx(dot)net> wrote:
>>
>> On Sat, 11 May 2024, David Rowley wrote:
>>> It will. It's just that Sorting requires fetching everything from its subnode.
>>
>> Isn't it plain wrong to have a sort step in the plan than? The different
>> partitions contain different value ranges with no overlap, and the last
>> query I posted doesn't even contain an ORDER BY clause, just a DISTINCT
>> clause on an indexed column.
>
> The query does contain an ORDER BY, so if the index is not chosen to
> provide pre-sorted input, then something has to put the results in the
> correct order before the LIMIT is applied.

The last query I tried was:

SELECT DISTINCT workitem_n FROM test_runs_raw LIMIT 10;

See my message at

[1] https://www.postgresql.org/message-id/69077f15-4125-2d63-733f-21ce6eac4f01%40gmx.net

Will re-check things and report back with further debugging info you asked
for later today.

Dimitris

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dimitrios Apostolou 2024-05-13 14:07:19 Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions
Previous Message David Rowley 2024-05-13 13:34:05 Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions