Re: sorting problem with distinct on()

From: Sky Lendar <skylendar(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: sorting problem with distinct on()
Date: 2023-05-05 14:25:18
Message-ID: CAJYOnMtgkeHkC-Kra=YAi2S=j1+DUu9LyWxkYDpxSJQMuwom9w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thank you for the clarification.

Le ven. 5 mai 2023 à 16:16, David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
a écrit :

> On Fri, May 5, 2023 at 6:55 AM Sky Lendar <skylendar(at)gmail(dot)com> wrote:
>
>> Hi there ans thx for reading and answering this post if you can:
>>
>> Let's regard an example of a table (stars) containing a code for a star
>> (symb)
>> and its index (nb) in a file.
>>
>>
>> nb | symb
>> ----+------------
>> 0 | alTau
>> 1 | alTau
>> 2 | bePer
>> 3 | alSco
>> 4 | alLeo
>> 5 | alCMa
>> 6 | alVir
>> 7 | Trappist-1
>> 8 | SgrA*
>> 9 | SgrA*
>> 10 | GA
>> 11 | VC
>> 12 | M31
>> 13 | M44
>> 14 | alUMi
>> 15 | M87
>> 16 | alCyg
>> 17 | alCyg
>> 18 | beOri
>> 19 | omiCet
>> 20 | epTau
>> 21 | epCas
>> 22 | alAnd
>> 23 | alAnd
>> 24 | beAnd
>> 25 | ga-1And
>> 26 | ga-1And
>> 27 | ga-1And
>> 28 | ga-1And
>> 29 | xiAnd
>> 30 | upAnd
>> 31 | upAnd
>> 32 | 14And
>> 33 | 51And
>> 34 | M31
>> 35 | alAql
>> 36 | alAql
>> 37 | beAql
>> 38 | gaAql
>> 39 | deAql
>> 40 | epAql
>> 41 | zeAql
>> 42 | zeAql
>> 43 | etAql
>> 44 | thAql
>> 45 | ioAql
>> 46 | laAql
>> 47 | xiAql
>> 48 | 12Aql
>> 49 | alAqr
>> 50 | alAqr
>> 51 | beAqr
>> 52 | gaAqr
>> 53 | gaAqr
>> 54 | deAqr
>> 55 | epAqr
>> 56 | epAqr
>> 57 | ze-1Aqr
>> 58 | etAqr
>> 59 | etAqr
>> 60 | thAqr
>> 61 | kaAqr
>> 62 | laAqr
>> 63 | laAqr
>> 64 | laAqr
>> 65 | laAqr
>> 66 | nuAqr
>> 67 | piAqr
>> 68 | xiAqr
>> 69 | Trappist-1
>> 70 | alAra
>> 71 | muAra
>> 72 | alAri
>> 73 | beAri
>> 74 | beAri
>> 75 | gaAri
>> 76 | deAri
>> 77 | 39Ari
>> 78 | 41Ari
>> 79 | alAur
>> 80 | alAur
>> 81 | beAur
>> 82 | deAur
>> 83 | epAur
>> 84 | epAur
>> 85 | epAur
>> 86 | zeAur
>> 87 | zeAur
>> 88 | zeAur
>> 89 | zeAur
>> 90 | etAur
>> 91 | etAur
>> 92 | thAur
>> 93 | thAur
>> 94 | thAur
>> 95 | ioAur
>> 96 | ioAur
>> 97 | ioAur
>> 98 | alBoo
>> 99 | alBoo
>>
>> Notice that some symbs are duplicated and I want to select only the
>> distinct symbs.
>> So, I could use
>>
>> select distinct on(symb) * from stars;
>>
>> I get this result:
>>
>> nb | symb
>> ----+-------
>> 48 | 12Aql
>> 32 | 14And
>> 77 | 39Ari
>> 78 | 41Ari
>> 33 | 51And
>> 10 | GA
>> 34 | M31
>> 13 | M44
>> 15 | M87
>> 9 | SgrA*
>>
>> Notice that 12 is missing in the list.
>> Even with x as (select distinct on(symb) * from stars) select * from x
>> where nb = 12 order by nb;
>>
>
> nb = 12 is a duplicate with np = 34
>
> Since your DISTINCT ON *subquery* doesn't specify an ordering which of
> those two are chosen as the representative record for M31 is
> non-determinstic.
>
> If you want to ensure the lowest valued nb is chosen you need to sort the
> *subquery*. The first record the DISTINCT encounters is the one selected
> to represent.
>
> Sorting in the outer/main query happens after the DISTINCT and so the
> record is already gone.
>
> David J.
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2023-05-05 14:26:18 Re: sorting problem with distinct on()
Previous Message Jeffrey Walton 2023-05-05 14:16:27 Re: SSL Enablement in Postgres via Client App like PG-ADMIN-4, Java.