Re: general purpose array_sort

From: Junwang Zhao <zhjwpku(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "andreas(at)proxel(dot)se" <andreas(at)proxel(dot)se>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: general purpose array_sort
Date: 2024-10-09 15:46:19
Message-ID: CAEG8a3J=yxYvGpnn3WzsjmkgWQ3BraUUo03gsww5jtejUPmGxQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 9, 2024 at 10:10 PM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
>
> Hi Amit,
>
> On Thu, Oct 3, 2024 at 2:22 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> >
> > Hi Junwang,
> >
> > On Wed, Oct 2, 2024 at 11:46 PM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
> > > On Wed, Oct 2, 2024 at 9:51 AM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
> > > >
> > > > > >
> > > > > > + typentry = (TypeCacheEntry *) fcinfo->flinfo->fn_extra;
> > > > > > + if (typentry == NULL || typentry->type_id != elmtyp)
> > > > > > + {
> > > > > > + typentry = lookup_type_cache(elmtyp, sort_asc ? TYPECACHE_LT_OPR :
> > > > > > TYPECACHE_GT_OPR);
> > > > > > + fcinfo->flinfo->fn_extra = (void *) typentry;
> > > > > > + }
> > > > > > you need to one-time check typentry->lt_opr or typentry->gt_opr exists?
> > > > > > see CreateStatistics.
> > > > > > /* Disallow data types without a less-than operator */
> > > > > > type = lookup_type_cache(attForm->atttypid, TYPECACHE_LT_OPR);
> > > > > > if (type->lt_opr == InvalidOid)
> > > > > > ereport(ERROR,
> > > > > > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> > > > > > errmsg("column \"%s\" cannot be used in
> > > > > > statistics because its type %s has no default btree operator class",
> > > > > > attname, format_type_be(attForm->atttypid))));
> > > > >
> > > > > I added an Assert for this part, not sure if that is enough.
> > > > >
> > > >
> > > > i think it really should be:
> > > >
> > > > if (typentry == NULL || typentry->type_id != elmtyp)
> > > > {
> > > > typentry = lookup_type_cache(elmtyp, sort_asc ? TYPECACHE_LT_OPR :
> > > > TYPECACHE_GT_OPR);
> > > > fcinfo->flinfo->fn_extra = (void *) typentry;
> > > > if ((sort_asc && !OidIsValid(typentry->lt_opr) || (!sort_as &&
> > > > OidIsValid(typentry->gt_opr));
> > > > ereport(ERROR,....)
> > > > }
> > > >
> > > > Imagine a type that doesn't have TYPECACHE_LT_OPR or TYPECACHE_GT_OPR
> > > > then we cannot do the sort, we should just error out.
> > > >
> > > > I just tried this colour type [1] with (CREATE TYPE colour (INPUT =
> > > > colour_in, OUTPUT = colour_out, LIKE = pg_catalog.int4);
> > > >
> > > > select array_sort('{#FF0000, #FF0000}'::colour[]);
> > > > of course it will segfault with your new Assert.
> > > >
> > > >
> > > > [1] https://github.com/hlinnaka/colour-datatype/blob/master/colour.c
> > >
> > > Make sense, PFA v5 with Jian's suggestion.
> >
> > Have you noticed that the tests have failed on Cirrus CI runs of this patch?
> >
> > https://cirrus-ci.com/github/postgresql-cfbot/postgresql/cf%2F5277
>
> Sorry for the late reply due to my vacation. I should have paid
> more attention to Cirrus CI earlier ;)
>
> >
> > It might be related to the test machines having a different *default*
> > locale than your local environment, which could result in a different
> > sort order for the test data. You may need to add an explicit COLLATE
> > clause to the tests to ensure consistent sorting across systems.
>
> I've changed the tests to use just ASCII characters, then added
> *COLLATE "C"* to the tests and CI passed, PFA v6.

Sadly the CI only passed on my own github repo, it failed on
cfbot[1], will dig into the reason later because I can not open the cirrus
ci page right now ;(

[1] https://cirrus-ci.com/task/5815925960605696

>
> >
> > --
> > Thanks, Amit Langote
>
>
>
> --
> Regards
> Junwang Zhao

--
Regards
Junwang Zhao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joel Jacobson 2024-10-09 15:48:57 Re: Should CSV parsing be stricter about mid-field quotes?
Previous Message Alexander Kuznetsov 2024-10-09 15:23:40 Re: [PATCH] Check for TupleTableSlot nullness before dereferencing