Re: Make printtup a bit faster

From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>,Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Make printtup a bit faster
Date: 2024-08-30 00:09:43
Message-ID: 87bk1aj2go.fsf@163.com
Lists: pgsql-hackers

David Rowley <dgrowleyml(at)gmail(dot)com> writes:

Hello David,

>> My high level proposal is define a type specific print function like:
>>
>> oidprint(Datum datum, StringInfo buf)
>> textprint(Datum datum, StringInfo buf)
>
> I think what we should do instead is make the output functions take a
> StringInfo and just pass it the StringInfo where we'd like the bytes
> written.
>
> That of course would require rewriting all the output functions for
> all the built-in types, so not a small task. Extensions make that job
> harder. I don't think it would be good to force extensions to rewrite
> their output functions, so perhaps some wrapper function could help us
> align the APIs for extensions that have not been converted yet.

I share Tom's concern that this approach looks too aggressive. That's
why I said:

"If a type's print function is not defined, we can still using the out
function."

AND

"Hard coding the relationship between [common] used type and {type}print
function OID looks not cool, Adding a new attribute in pg_type looks too
aggressive however. Anyway this is the next topic to talk about."

What would be the extra benefit of redesigning all the output functions?
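
To make the fallback idea concrete, here is a minimal, self-contained
sketch. It is not PostgreSQL code: Buf is a simplified stand-in for
StringInfo, a plain uint32_t stands in for Datum, and the names
oid_print, oid_out and emit_value are hypothetical. It only shows the
shape of "use the print function if the type has one, otherwise call
the out function and copy its result".

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for StringInfo (assumption, not the real struct). */
typedef struct Buf
{
    char   *data;
    size_t  len;
    size_t  cap;
} Buf;

static void
buf_init(Buf *b)
{
    b->cap = 64;
    b->len = 0;
    b->data = malloc(b->cap);
    assert(b->data != NULL);
    b->data[0] = '\0';
}

/* Ensure room for 'extra' more bytes plus the trailing NUL. */
static void
buf_reserve(Buf *b, size_t extra)
{
    while (b->len + extra + 1 > b->cap)
        b->cap *= 2;
    b->data = realloc(b->data, b->cap);
    assert(b->data != NULL);
}

static void
buf_append(Buf *b, const char *s, size_t n)
{
    buf_reserve(b, n);
    memcpy(b->data + b->len, s, n);
    b->len += n;
    b->data[b->len] = '\0';
}

/* "out" style: returns a freshly allocated C string. */
typedef char *(*out_fn) (uint32_t value);

/* "print" style: writes straight into the caller's buffer. */
typedef void (*print_fn) (uint32_t value, Buf *buf);

static char *
oid_out(uint32_t value)
{
    char *s = malloc(12);       /* 10 digits + NUL, rounded up */

    assert(s != NULL);
    snprintf(s, 12, "%u", (unsigned) value);
    return s;
}

static void
oid_print(uint32_t value, Buf *buf)
{
    buf_reserve(buf, 10);       /* a 32-bit value has at most 10 digits */
    buf->len += (size_t) snprintf(buf->data + buf->len, 11, "%u",
                                  (unsigned) value);
}

/* Use the print function when one exists; otherwise fall back to out. */
static void
emit_value(uint32_t value, print_fn print, out_fn out, Buf *buf)
{
    if (print != NULL)
        print(value, buf);
    else
    {
        char *s = out(value);   /* extra allocation, strlen and copy */

        buf_append(buf, s, strlen(s));
        free(s);
    }
}
```

The fallback branch pays for an allocation, a strlen() and a memcpy()
that the print branch avoids, which is the saving the proposal is
after. Types without a print function keep working unchanged.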

> There's a similar problem with input functions not having knowledge of
> the input length. You only have to look at textin() to see how useful
> that could be. Fixing that would probably make COPY FROM horrendously
> faster. Team that up with SIMD for the delimiter char search and COPY
> go a bit better still. Neil Conway did propose the SIMD part in [1],
> but it's just not nearly as good as it could be when having to still
> perform the strlen() calls.

OK, I think I understand the need for the input functions to know the
input length, and it's good to know about the SIMD work for the
delimiter char search. strlen looks like a delimiter char search (for
'\0') as well. I'm not sure whether strlen has been implemented with
SIMD, but if not, why not?
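
As a rough illustration of the point about input length, the sketch
below contrasts a cstring-style input function with a length-aware
one. It is plain C, not PostgreSQL code: field_length,
text_in_cstring and text_in_with_len are hypothetical names, and
memchr() stands in for whatever delimiter search (SIMD or otherwise)
the caller uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find the length of the field starting at p, given that 'remaining'
 * bytes of the line are left. The caller learns the length as a side
 * effect of locating the delimiter.
 */
static size_t
field_length(const char *p, size_t remaining, char delim)
{
    const char *hit = memchr(p, delim, remaining);

    return hit ? (size_t) (hit - p) : remaining;
}

/*
 * cstring-style input: the field must already be NUL-terminated, and
 * the function scans it again with strlen() to rediscover a length
 * the caller already had.
 */
static char *
text_in_cstring(const char *s)
{
    size_t  n = strlen(s);
    char   *copy = malloc(n + 1);

    assert(copy != NULL);
    memcpy(copy, s, n + 1);
    return copy;
}

/*
 * Length-aware input: no second scan, and the source does not need
 * to be NUL-terminated at the field boundary.
 */
static char *
text_in_with_len(const char *s, size_t n)
{
    char *copy = malloc(n + 1);

    assert(copy != NULL);
    memcpy(copy, s, n);
    copy[n] = '\0';
    return copy;
}
```

With the length passed in, each field is scanned once (for the
delimiter) instead of twice (delimiter, then '\0').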

> I had planned to work on this for PG18, but I'd be happy for some
> assistance if you're willing.

I see you have done a lot of amazing work on cache-line-friendly data
structure design, branch prediction optimization and SIMD optimization.
I'd like to try one myself. I'm not sure if I can meet the target; what
if we handle the out/in functions separately (possibly by different
people)?

--
Best Regards
Andy Fan
