Re: Performance of "distinct with limit"

From: Klaudie Willis <Klaudie(dot)Willis(at)protonmail(dot)com>
To: "luis(dot)roberto(at)siscobra(dot)com(dot)br" <luis(dot)roberto(at)siscobra(dot)com(dot)br>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Performance of "distinct with limit"
Date: 2020-08-28 12:34:09
Message-ID: KK_HX4z5BF8lq32xzIFQnnBgcFgV9m7egEHQc0EHzfhhBw4Ir5TTqlE-MLbD7-C_WXY6wwzx2dKRU6LG6TY2r80zgWjhe5iyeclfMIRXdls=@protonmail.com
Lists: pgsql-general

No, there is no index on n. An index might solve it, yes, but this seemed to me like a trivial optimization even without one. Evidently it is not.

QUERY PLAN
----------------------------------------------------------------------------------
Limit  (cost=1911272.10..1911272.12 rows=2 width=7)
  ->  HashAggregate  (cost=1911272.10..1911282.45 rows=1035 width=7)
        Group Key: cfi
        ->  Seq Scan on bigtable  (cost=0.00..1817446.08 rows=37530408 width=7)
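
In the plan, the startup cost of the Limit node (1911272.10) equals that of the HashAggregate, which suggests the aggregate reads the whole table before it hands over its first row. A minimal Python sketch of the difference between that blocking behaviour and the early-stopping behaviour I was hoping for is below. This is only an illustration, not PostgreSQL's actual executor code; counting_scan and rows_read are hypothetical helpers made up for the example.

```python
from itertools import islice


def distinct_blocking(rows):
    """Hash-aggregate style: read every input row, then emit distinct values."""
    seen = dict.fromkeys(rows)  # consumes the entire input up front
    return iter(seen)


def distinct_streaming(rows):
    """Streaming style: emit each value the first time it is seen."""
    seen = set()
    for value in rows:
        if value not in seen:
            seen.add(value)
            yield value


def counting_scan(values, counter):
    """Hypothetical stand-in for a sequential scan that counts rows read."""
    for value in values:
        counter[0] += 1
        yield value


def rows_read(distinct_fn, values, limit):
    """Apply 'distinct ... limit' and report (result, number of rows scanned)."""
    counter = [0]
    result = list(islice(distinct_fn(counting_scan(values, counter)), limit))
    return result, counter[0]
```

With a streaming distinct, the limit can stop the scan as soon as enough distinct values have turned up; with the blocking variant the row count read is always the full table size, which matches the timings I see.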

Klaudie

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, August 28, 2020 1:59 PM, <luis(dot)roberto(at)siscobra(dot)com(dot)br> wrote:

> Hi,
>
> If "n" is indexed, it should run quickly. Can you share the execution plan for your query?
>
> ---------------------------------------------------------------
>
> From: "Klaudie Willis" <Klaudie(dot)Willis(at)protonmail(dot)com>
> To: "pgsql-general" <pgsql-general(at)lists(dot)postgresql(dot)org>
> Sent: Friday, August 28, 2020 8:29:58
> Subject: Performance of "distinct with limit"
>
> Hi,
>
> Ran into this under-optimized query execution.
>
> select distinct n from bigtable; -- Let's say this takes 2 minutes
> select distinct n from bigtable limit 2; -- This takes approximately the same time
>
> However, the latter should have the potential to be so much quicker. I checked the same query on MSSQL (with 'top 2'), and it seems to do exactly the optimization I would expect.
>
> Is there any way to achieve a similar speedup in PostgreSQL?
>
> Klaudie
