From: | Andy Fan <zhihuifan1213(at)163(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>,Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: Make printtup a bit faster |
Date: | 2024-08-30 00:09:43 |
Message-ID: | 87bk1aj2go.fsf@163.com |
Lists: | pgsql-hackers |
David Rowley <dgrowleyml(at)gmail(dot)com> writes:
Hello David,
>> My high level proposal is define a type specific print function like:
>>
>> oidprint(Datum datum, StringInfo buf)
>> textprint(Datum datum, StringInfo buf)
>
> I think what we should do instead is make the output functions take a
> StringInfo and just pass it the StringInfo where we'd like the bytes
> written.
>
> That of course would require rewriting all the output functions for
> all the built-in types, so not a small task. Extensions make that job
> harder. I don't think it would be good to force extensions to rewrite
> their output functions, so perhaps some wrapper function could help us
> align the APIs for extensions that have not been converted yet.
I have a similar concern to Tom's: this approach looks too
aggressive. That's why I said:
"If a type's print function is not defined, we can still use the out
function."
AND
"Hard coding the relationship between [commonly] used types and {type}print
function OIDs does not look great; adding a new attribute to pg_type looks too
aggressive, however. Anyway, this is the next topic to talk about."
What would be the extra benefit of redesigning all the out functions?
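To make the fallback idea concrete, here is a minimal, self-contained sketch. It is not PostgreSQL code: `Buf` is a stand-in for StringInfo, `TypeEntry` is a hypothetical per-type lookup entry, and `oid_out` / `oid_print` only imitate the shape of an out function and of the proposed print function. The point is the dispatch in `emit_value`: use the print function when the type has one, otherwise call the out function and copy its result.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for StringInfo (assumption: NOT the real PostgreSQL struct). */
typedef struct Buf {
    char  *data;
    size_t len;
    size_t cap;
} Buf;

static void buf_init(Buf *b) {
    b->cap = 64;
    b->len = 0;
    b->data = malloc(b->cap);
    assert(b->data != NULL);
    b->data[0] = '\0';
}

static void buf_append(Buf *b, const char *s, size_t n) {
    if (b->len + n + 1 > b->cap) {
        while (b->len + n + 1 > b->cap)
            b->cap *= 2;
        b->data = realloc(b->data, b->cap);
        assert(b->data != NULL);
    }
    memcpy(b->data + b->len, s, n);
    b->len += n;
    b->data[b->len] = '\0';
}

/* Out-function shape: returns a freshly allocated C string. */
typedef char *(*out_fn)(uint32_t value);
/* Proposed print-function shape: appends to the caller's buffer. */
typedef void (*print_fn)(uint32_t value, Buf *buf);

/* Hypothetical per-type entry; print is NULL when not defined. */
typedef struct TypeEntry {
    out_fn   out;
    print_fn print;
} TypeEntry;

static char *oid_out(uint32_t value) {
    char *s = malloc(16);
    assert(s != NULL);
    snprintf(s, 16, "%u", (unsigned) value);
    return s;
}

static void oid_print(uint32_t value, Buf *buf) {
    char tmp[16];                      /* stack scratch, no heap allocation */
    int  n = snprintf(tmp, sizeof tmp, "%u", (unsigned) value);
    buf_append(buf, tmp, (size_t) n);  /* length already known, no strlen */
}

static void emit_value(const TypeEntry *t, uint32_t value, Buf *buf) {
    if (t->print != NULL) {
        t->print(value, buf);          /* fast path */
    } else {
        char *s = t->out(value);       /* fallback: allocate, measure, copy */
        buf_append(buf, s, strlen(s));
        free(s);
    }
}
```

Both paths produce the same bytes; the fallback pays for an extra allocation, a strlen() and a copy, which is the overhead the print functions are meant to avoid.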
> There's a similar problem with input functions not having knowledge of
> the input length. You only have to look at textin() to see how useful
> that could be. Fixing that would probably make COPY FROM horrendously
> faster. Team that up with SIMD for the delimiter char search and COPY
> go a bit better still. Neil Conway did propose the SIMD part in [1],
> but it's just not nearly as good as it could be when having to still
> perform the strlen() calls.
OK, I think I understand the need for the input functions to know the
input length, and it is good to know about the SIMD work for the delimiter
char search. strlen looks like a delimiter char search (for '\0') as well. I'm
not sure whether strlen has been implemented with SIMD, but if not, why not?
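A small sketch of that observation, using only standard C. The two helper names are hypothetical. Without a known length, the caller first scans for '\0' with strlen() and then scans again for the delimiter; with the length passed in, a single memchr() is enough. Many libc implementations (glibc among them) ship vectorized strlen() and memchr(), so whatever the implementation, the saving here is the avoided extra pass.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Length unknown: one pass to find the terminating '\0', then a second
 * pass over the same bytes to find the delimiter.
 * Returns the delimiter's offset, or -1 if it is absent.
 */
static ptrdiff_t find_delim_cstring(const char *s, char delim) {
    assert(s != NULL);
    size_t      len = strlen(s);
    const char *p = memchr(s, delim, len);
    return p ? p - s : -1;
}

/*
 * Length known: a single bounded search. The input does not even need
 * to be NUL-terminated.
 */
static ptrdiff_t find_delim_counted(const char *s, size_t len, char delim) {
    assert(s != NULL);
    const char *p = memchr(s, delim, len);
    return p ? p - s : -1;
}
```

For the line "42,hello,world" both helpers return offset 2 for ','; the counted variant can also be limited to a prefix of the buffer.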
> I had planned to work on this for PG18, but I'd be happy for some
> assistance if you're willing.
I see you have done a lot of amazing work on cache-line-friendly data
structure design, branch prediction optimization and SIMD optimization. I'd
like to try some of that myself. I'm not sure if I can meet the target, though.
What if we handle the out and in functions separately (possibly by different
people)?
--
Best Regards
Andy Fan