Re: Make printtup a bit faster

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andy Fan <zhihuifan1213(at)163(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Make printtup a bit faster
Date: 2024-08-29 11:51:48
Message-ID: CAApHDvrBNA-QRsbn-SJyRAsywjHCNenfcbi14c3O3c7=OimQ8Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, 29 Aug 2024 at 21:40, Andy Fan <zhihuifan1213(at)163(dot)com> wrote:
>
>
> Usually I see printtup in the perf-report with a noticeable ratio.

> Part of the slowness is caused by "memcpy", "strlen" and palloc in
> outfunction.

Yeah, it's a pretty inefficient API from a performance point of view.

> My high level proposal is define a type specific print function like:
>
> oidprint(Datum datum, StringInfo buf)
> textprint(Datum datum, StringInfo buf)

I think what we should do instead is make the output functions take a
StringInfo and just pass it the StringInfo where we'd like the bytes
written.
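
A minimal standalone sketch of that idea, not PostgreSQL code: `Buf`,
`buf_init`, `buf_reserve` and `int4_out_buf` are hypothetical stand-ins for
StringInfo and for an output function rewritten to take the destination
buffer. The point is only that the digits land directly in the caller's
buffer, with no intermediate allocation and no later strlen() + memcpy().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for StringInfo: a growable, NUL-terminated buffer. */
typedef struct Buf
{
    char   *data;
    size_t  len;    /* bytes used, not counting the terminator */
    size_t  cap;    /* bytes allocated */
} Buf;

static void
buf_init(Buf *buf)
{
    buf->cap = 64;
    buf->len = 0;
    buf->data = malloc(buf->cap);
    assert(buf->data != NULL);
    buf->data[0] = '\0';
}

/* Ensure room for "needed" more bytes plus the terminator. */
static void
buf_reserve(Buf *buf, size_t needed)
{
    if (buf->len + needed + 1 <= buf->cap)
        return;
    while (buf->len + needed + 1 > buf->cap)
        buf->cap *= 2;
    buf->data = realloc(buf->data, buf->cap);
    assert(buf->data != NULL);
}

/*
 * Hypothetical new-style output function: formats a 32-bit integer
 * straight into the caller's buffer instead of returning a fresh string.
 */
static void
int4_out_buf(int32_t value, Buf *buf)
{
    /* Unsigned magnitude; the subtraction is well defined for INT32_MIN. */
    uint32_t    u = value < 0 ? 0 - (uint32_t) value : (uint32_t) value;
    size_t      ndigits = 1;
    size_t      n;
    char       *end;

    for (uint32_t t = u; t >= 10; t /= 10)
        ndigits++;
    n = ndigits + (value < 0 ? 1 : 0);

    buf_reserve(buf, n);

    /* Fill the digits backwards from the end of the reserved space. */
    end = buf->data + buf->len + n;
    *end = '\0';
    do
    {
        *--end = (char) ('0' + u % 10);
        u /= 10;
    } while (u != 0);
    if (value < 0)
        *--end = '-';

    buf->len += n;
}
```

A caller would initialise one `Buf` per row (or reuse one) and call
`int4_out_buf(value, &buf)` once per column.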

That of course would require rewriting all the output functions for
all the built-in types, so not a small task. Extensions make that job
harder. I don't think it would be good to force extensions to rewrite
their output functions, so perhaps some wrapper function could help us
align the APIs for extensions that have not been converted yet.
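
One possible shape for such a wrapper, as a standalone sketch under
assumptions rather than a proposal for the actual API: `Buf`, `buf_append`,
`legacy_out_fn`, `legacy_long_out` and `call_legacy_out` are all made-up
names, and malloc()/free() stand in for palloc()/pfree(). An unconverted
function keeps returning a freshly allocated C string, and the wrapper pays
the strlen() and copy on its behalf.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for StringInfo. */
typedef struct Buf
{
    char   *data;
    size_t  len;
    size_t  cap;
} Buf;

/* Append n bytes, growing the buffer as needed; keeps it NUL-terminated. */
static void
buf_append(Buf *buf, const char *src, size_t n)
{
    if (buf->len + n + 1 > buf->cap)
    {
        size_t  newcap = buf->cap ? buf->cap : 64;

        while (buf->len + n + 1 > newcap)
            newcap *= 2;
        buf->data = realloc(buf->data, newcap);
        assert(buf->data != NULL);
        buf->cap = newcap;
    }
    memcpy(buf->data + buf->len, src, n);
    buf->len += n;
    buf->data[buf->len] = '\0';
}

/* Old-style signature: returns a newly allocated NUL-terminated string. */
typedef char *(*legacy_out_fn) (long value);

/* Example of an unconverted output function (hypothetical). */
static char *
legacy_long_out(long value)
{
    char   *s = malloc(32);

    assert(s != NULL);
    snprintf(s, 32, "%ld", value);
    return s;
}

/*
 * Wrapper: lets callers use the buffer-based convention everywhere, at
 * the cost of one strlen() and one copy for functions not yet converted.
 */
static void
call_legacy_out(legacy_out_fn fn, long value, Buf *buf)
{
    char   *s = fn(value);

    buf_append(buf, s, strlen(s));
    free(s);
}
```

With something like this, converted types skip the extra allocation and
copy, while unconverted ones behave as they do today.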

There's a similar problem with input functions not having knowledge of
the input length. You only have to look at textin() to see how useful
that could be. Fixing that would probably make COPY FROM horrendously
faster. Team that up with SIMD for the delimiter char search and COPY
would go a bit better still. Neil Conway did propose the SIMD part in [1],
but it's just not nearly as good as it could be when having to still
perform the strlen() calls.
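
To illustrate the length-aware side, here is a small standalone sketch. It
is not the COPY code and not the patch in [1]: `FieldRef`, `split_fields`
and `uint_in_len` are hypothetical names, and memchr() is used only as a
stand-in for a vectorised delimiter search (many libc implementations
optimise it that way, but that is an assumption about the platform). Once
the splitter knows where each field ends, the input function can take a
pointer and a length and never needs to call strlen().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical (pointer, length) view of a field; no terminator required. */
typedef struct FieldRef
{
    const char *ptr;
    size_t      len;
} FieldRef;

/*
 * Split line[0..len) on delim into at most maxfields entries.
 * Returns the number of fields found.
 */
static size_t
split_fields(const char *line, size_t len, char delim,
             FieldRef *out, size_t maxfields)
{
    const char *cur = line;
    const char *end = line + len;
    size_t      nfields = 0;

    while (nfields < maxfields)
    {
        const char *hit = memchr(cur, delim, (size_t) (end - cur));
        const char *stop = hit ? hit : end;

        out[nfields].ptr = cur;
        out[nfields].len = (size_t) (stop - cur);
        nfields++;

        if (hit == NULL)
            break;
        cur = hit + 1;
    }
    return nfields;
}

/*
 * Hypothetical length-aware input function: parses an unsigned decimal
 * from exactly len bytes. Returns 1 on success, 0 on bad input.
 * Overflow is not checked in this sketch.
 */
static int
uint_in_len(const char *ptr, size_t len, unsigned long *result)
{
    unsigned long v = 0;

    if (len == 0)
        return 0;
    for (size_t i = 0; i < len; i++)
    {
        if (ptr[i] < '0' || ptr[i] > '9')
            return 0;
        v = v * 10 + (unsigned long) (ptr[i] - '0');
    }
    *result = v;
    return 1;
}
```

The fields are never copied or NUL-terminated here; each one is just a
slice of the original line.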

I had planned to work on this for PG18, but I'd be happy for some
assistance if you're willing.

David

[1] https://postgr.es/m/CAOW5sYb1HprQKrzjCsrCP1EauQzZy+njZ-AwBbOUMoGJHJS7Sw@mail.gmail.com
