Re: NAMEDATALEN increase because of non-latin languages

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Денис Романенко <deromanenko(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: NAMEDATALEN increase because of non-latin languages
Date: 2022-07-22 07:52:43
Message-ID: CAFBsxsFmJGiNpWkvdmLGD-YGt6hU_xRxcLJEz+JHJ4=C8qvoaQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 19, 2022 at 10:57 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2022-07-19 14:30:34 +0700, John Naylor wrote:
> > I'm thinking where the first few attributes are fixed length, not null, and
> > (because of AIX) not double-aligned, we can do a single memcpy on multiple
> > columns at once. That will still be a common pattern after namedata is
> > varlen. Otherwise, use helper functions/macros similar to the above but
> > instead of passing a tuple descriptor, use info we have at compile time.
>
> I think that might be over-optimizing things. I don't think we do these
> conversions at a rate that's high enough to warrant it - the common stuff
> should be in relcache etc. It's possible that we might want to optimize the
> catcache case specifically - but that'd be more optimizing memory usage than
> "conversion" imo.
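
To make the quoted single-memcpy idea concrete, here is a minimal standalone sketch. It is not PostgreSQL code: DemoCastForm, DEMO_FIXED_PREFIX_LEN and demo_copy_fixed_prefix are hypothetical names, and the sketch assumes the raw tuple data for the leading columns has exactly the same layout as the C struct (fixed width, not null, nothing wider than 4-byte alignment).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a catalog struct; not the real FormData_pg_cast. */
typedef struct DemoCastForm
{
    uint32_t    oid;
    uint32_t    castsource;
    uint32_t    casttarget;
    uint32_t    castfunc;
    char        castcontext;
    char        castmethod;
} DemoCastForm;

/* Number of bytes covered by the fixed-width, not-null leading columns. */
#define DEMO_FIXED_PREFIX_LEN \
    (offsetof(DemoCastForm, castmethod) + sizeof(char))

/*
 * Copy all leading fixed-width columns with one memcpy.
 *
 * Assumption: the bytes at tuple_data are laid out exactly like the struct
 * members above, so no per-column alignment handling is needed.
 */
static void
demo_copy_fixed_prefix(DemoCastForm *dst, const char *tuple_data)
{
    memcpy(dst, tuple_data, DEMO_FIXED_PREFIX_LEN);
}
```

Any variable-length column following the prefix would still need per-column handling; the sketch only covers the fixed part.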

Okay, here is a hackish experiment that applies on top of v2 but also
invalidates some of that earlier work. Since there is already a pg_cast.c,
I demoed a new function there which looks like this:

void
Deform_pg_cast_tuple(Form_pg_cast pg_cast_struct, HeapTuple pg_cast_tuple,
                     TupleDesc pg_cast_desc)
{
    Datum       values[Natts_pg_cast];
    bool        isnull[Natts_pg_cast];

    heap_deform_tuple(pg_cast_tuple, pg_cast_desc, values, isnull);

    pg_cast_struct->oid = DatumGetObjectId(values[Anum_pg_cast_oid - 1]);
    pg_cast_struct->castsource =
        DatumGetObjectId(values[Anum_pg_cast_castsource - 1]);
    pg_cast_struct->casttarget =
        DatumGetObjectId(values[Anum_pg_cast_casttarget - 1]);
    pg_cast_struct->castfunc =
        DatumGetObjectId(values[Anum_pg_cast_castfunc - 1]);
    pg_cast_struct->castcontext =
        DatumGetChar(values[Anum_pg_cast_castcontext - 1]);
    pg_cast_struct->castmethod =
        DatumGetChar(values[Anum_pg_cast_castmethod - 1]);
}

For the general case we can use pg_*_deform.c or something like that, with
extern declarations in the main headers. To get this to work, I had to add
a couple of pointless table open/close calls to get the tuple descriptor,
since currently the whole tuple is stored in the syscache, but that's not
good even as a temporary measure. Storing the full struct in the syscache
is a good future step, as noted upthread, but to get there without a bunch
more churn, maybe the above function can copy the tuple descriptor into a
local stack variable from an expanded version of schemapg.h. Once the
deformed structs are stored in caches, I imagine most of the places where we
want to deform will already have the table open, and we can pass the
descriptor as above without additional code.
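
As a rough illustration of deforming from compile-time information instead of a descriptor fetched by opening the table, here is a minimal standalone sketch. Everything in it is hypothetical (DemoAttr, demo_cast_attrs, DemoCast, demo_compute_offsets, demo_deform_cast); the static array merely stands in for what an expanded schemapg.h might provide, and the sketch assumes fixed-width, not-null columns with power-of-two alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-attribute info known at compile time. */
typedef struct DemoAttr
{
    int16_t     attlen;     /* fixed width in bytes */
    int16_t     attalign;   /* required alignment in bytes (power of two) */
} DemoAttr;

#define DEMO_NATTS 6

/* Stand-in for descriptor data generated from the catalog headers. */
static const DemoAttr demo_cast_attrs[DEMO_NATTS] = {
    {4, 4}, {4, 4}, {4, 4}, {4, 4}, {1, 1}, {1, 1}
};

/* Hypothetical deformed struct; not the real FormData_pg_cast. */
typedef struct DemoCast
{
    uint32_t    oid;
    uint32_t    castsource;
    uint32_t    casttarget;
    uint32_t    castfunc;
    char        castcontext;
    char        castmethod;
} DemoCast;

/* Compute each attribute's byte offset, rounding up to its alignment. */
static void
demo_compute_offsets(const DemoAttr *attrs, int natts, size_t *offsets)
{
    size_t      off = 0;

    for (int i = 0; i < natts; i++)
    {
        size_t      align = (size_t) attrs[i].attalign;

        off = (off + align - 1) & ~(align - 1);
        offsets[i] = off;
        off += (size_t) attrs[i].attlen;
    }
}

/* Deform raw column data into the struct using only the static info. */
static void
demo_deform_cast(DemoCast *out, const char *data)
{
    size_t      offsets[DEMO_NATTS];

    demo_compute_offsets(demo_cast_attrs, DEMO_NATTS, offsets);

    memcpy(&out->oid, data + offsets[0], sizeof(out->oid));
    memcpy(&out->castsource, data + offsets[1], sizeof(out->castsource));
    memcpy(&out->casttarget, data + offsets[2], sizeof(out->casttarget));
    memcpy(&out->castfunc, data + offsets[3], sizeof(out->castfunc));
    out->castcontext = data[offsets[4]];
    out->castmethod = data[offsets[5]];
}
```

The point of the sketch is only that the caller never needs a table open/close to obtain attribute metadata; null handling and variable-length columns are left out.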

--
John Naylor
EDB: http://www.enterprisedb.com

Attachment: deform-pg_cast-using-standard-function.patch (application/x-patch, 6.0 KB)
