Make printtup a bit faster

From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Make printtup a bit faster
Date: 2024-08-29 09:40:14
Message-ID: 87wmjzfz0h.fsf@163.com
Lists: pgsql-hackers


I often see printtup take a noticeable share of the perf report. Taking
"SELECT * FROM pg_class" as an example, we can see:

85.65% 3.25% postgres postgres [.] printtup

The high-level design of printtup is:

1. It uses a pre-allocated StringInfo, DR_printtup.buf, to store the data
for each tuple.
2. For each datum in the tuple, it calls the type-specific out function
and gets a cstring.
3. After getting the cstring, it computes the length and appends both the
length and the data to DR_printtup.buf.
4. After all the datums are handled, socket_putmessage copies them into
PqSendBuffer.
5. When the usage of PqSendBuffer reaches PqSendBufferSize, the send
syscall is used to send the data to the client (copying it from user
space to kernel space once more).
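
To make steps 2 and 3 concrete, here is a minimal standalone sketch of
what happens for one oid column. It is not the real backend code: Buf,
buf_append, oid_out and send_oid_via_out are hypothetical stand-ins for
StringInfo, the pqformat helpers and oidout, and malloc stands in for
palloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for StringInfo / DR_printtup.buf. */
typedef struct Buf
{
    char   *data;
    size_t  len;
    size_t  cap;
} Buf;

static void
buf_init(Buf *b)
{
    b->len = 0;
    b->cap = 64;
    b->data = malloc(b->cap);
    assert(b->data != NULL);
}

/* Append n bytes, growing the buffer when needed. */
static void
buf_append(Buf *b, const void *src, size_t n)
{
    if (b->len + n > b->cap)
    {
        while (b->len + n > b->cap)
            b->cap *= 2;
        b->data = realloc(b->data, b->cap);
        assert(b->data != NULL);
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
}

/* Append a 4-byte length in network byte order. */
static void
buf_append_int32(Buf *b, uint32_t v)
{
    unsigned char be[4];

    be[0] = (unsigned char) (v >> 24);
    be[1] = (unsigned char) (v >> 16);
    be[2] = (unsigned char) (v >> 8);
    be[3] = (unsigned char) v;
    buf_append(b, be, 4);
}

/* Stand-in for an out function such as oidout: returns a fresh C string. */
static char *
oid_out(uint32_t oid)
{
    char   *s = malloc(11);     /* 10 digits + NUL */

    assert(s != NULL);
    snprintf(s, 11, "%u", (unsigned) oid);
    return s;
}

/* Steps 2 and 3 for one oid column. */
static void
send_oid_via_out(Buf *b, uint32_t oid)
{
    char   *s = oid_out(oid);   /* step 2: extra allocation */
    size_t  n = strlen(s);      /* step 3: rescan to find the length */

    buf_append_int32(b, (uint32_t) n);
    buf_append(b, s, n);        /* step 3: copy into the tuple buffer */
    free(s);
}
```

The allocation in oid_out, the strlen and the second copy in buf_append
are the three costs that show up in the profile below.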

Part of the slowness is caused by memcpy, strlen, and the palloc in the
out function.

8.35% 8.35% postgres libc.so.6 [.] __strlen_avx2
4.27% 0.00% postgres libc.so.6 [.] __memcpy_avx_unaligned_erms
3.93% 3.93% postgres postgres [.] palloc (part of it caused by the out function)
5.70% 5.70% postgres postgres [.] AllocSetAlloc (part of it caused by printtup)

My high-level proposal is to define type-specific print functions like:

oidprint(Datum datum, StringInfo buf)
textprint(Datum datum, StringInfo buf)

Such a function should append both the data and its length to buf directly.

For the oidprint case, we can avoid:

- the dedicated palloc in the oid out function.
- the memcpy from that memory into DR_printtup.buf.

For the textprint case, we can avoid:

- strlen, since we can figure out the length from varlena.vl_len.

int2/4/8, timestamp, date, and time are similar to oid, and numeric and
varchar are similar to text. This covers almost all of the commonly used
types.
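
A rough standalone sketch of what such print functions could look like
follows. Everything here is an assumption made for illustration: Buf
stands in for StringInfo, Text is a simplified varlena whose len field
holds only the payload length (the real vl_len includes the header and
has short-header forms), and encoding conversion is ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for StringInfo. */
typedef struct Buf
{
    char   *data;
    size_t  len;
    size_t  cap;
} Buf;

/* Simplified varlena: len is the payload length only (an assumption). */
typedef struct Text
{
    uint32_t    len;
    const char *data;
} Text;

static void
buf_init(Buf *b)
{
    b->len = 0;
    b->cap = 64;
    b->data = malloc(b->cap);
    assert(b->data != NULL);
}

/* Make room for n more bytes and return the current write position. */
static char *
buf_reserve(Buf *b, size_t n)
{
    if (b->len + n > b->cap)
    {
        while (b->len + n > b->cap)
            b->cap *= 2;
        b->data = realloc(b->data, b->cap);
        assert(b->data != NULL);
    }
    return b->data + b->len;
}

/* Store a 4-byte length in network byte order at p. */
static void
put_be32(char *p, uint32_t v)
{
    unsigned char *u = (unsigned char *) p;

    u[0] = (unsigned char) (v >> 24);
    u[1] = (unsigned char) (v >> 16);
    u[2] = (unsigned char) (v >> 8);
    u[3] = (unsigned char) v;
}

/* Format the digits straight into buf: no separate allocation, no copy. */
static void
oid_print(uint32_t oid, Buf *b)
{
    /* 4 length bytes + up to 10 digits + the NUL snprintf writes */
    char   *p = buf_reserve(b, 4 + 11);
    int     n = snprintf(p + 4, 11, "%u", (unsigned) oid);

    put_be32(p, (uint32_t) n);
    b->len += 4 + (size_t) n;
}

/* The length is already known from the header, so no strlen is needed. */
static void
text_print(const Text *t, Buf *b)
{
    char   *p = buf_reserve(b, 4 + (size_t) t->len);

    put_be32(p, t->len);
    memcpy(p + 4, t->data, t->len);
    b->len += 4 + (size_t) t->len;
}
```

In this sketch oid_print still pays for the formatting itself, and
text_print still does one memcpy into buf; only the intermediate cstring
and the strlen go away.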

Hard-coding the mapping between commonly used types and their {type}print
function OIDs does not look great, but adding a new attribute to pg_type
looks too aggressive. Anyway, this is the next topic to discuss.

If a type's print function is not defined, we can still use the out
function (and PrinttupAttrInfo caches FmgrInfo rather than
FunctionCallInfo, so there is some optimization in this step as well).
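
The fallback could be sketched per attribute as below. This is only an
illustration under assumptions: AttrInfo, send_attr, PrintFn and OutFn
are hypothetical names, the datum is reduced to a uint32_t, and plain
function pointers stand in for the cached fmgr lookup data.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for StringInfo. */
typedef struct Buf
{
    char   *data;
    size_t  len;
    size_t  cap;
} Buf;

typedef void  (*PrintFn) (uint32_t datum, Buf *b);
typedef char *(*OutFn) (uint32_t datum);

/* Per-attribute cache: print is NULL when the type has no print function. */
typedef struct AttrInfo
{
    PrintFn print;
    OutFn   out;
} AttrInfo;

static void
buf_init(Buf *b)
{
    b->len = 0;
    b->cap = 64;
    b->data = malloc(b->cap);
    assert(b->data != NULL);
}

static void
buf_append(Buf *b, const void *src, size_t n)
{
    if (b->len + n > b->cap)
    {
        while (b->len + n > b->cap)
            b->cap *= 2;
        b->data = realloc(b->data, b->cap);
        assert(b->data != NULL);
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
}

static void
buf_append_int32(Buf *b, uint32_t v)
{
    unsigned char be[4];

    be[0] = (unsigned char) (v >> 24);
    be[1] = (unsigned char) (v >> 16);
    be[2] = (unsigned char) (v >> 8);
    be[3] = (unsigned char) v;
    buf_append(b, be, 4);
}

/* Example print function: formats on the stack, appends length and data. */
static void
oid_print(uint32_t datum, Buf *b)
{
    char    tmp[11];
    int     n = snprintf(tmp, sizeof tmp, "%u", (unsigned) datum);

    buf_append_int32(b, (uint32_t) n);
    buf_append(b, tmp, (size_t) n);
}

/* Example out function: returns a freshly allocated C string. */
static char *
oid_out(uint32_t datum)
{
    char   *s = malloc(11);

    assert(s != NULL);
    snprintf(s, 11, "%u", (unsigned) datum);
    return s;
}

/* Use the print function when one exists, else fall back to the out path. */
static void
send_attr(const AttrInfo *a, uint32_t datum, Buf *b)
{
    char   *s;
    size_t  n;

    if (a->print != NULL)
    {
        a->print(datum, b);
        return;
    }
    s = a->out(datum);
    n = strlen(s);
    buf_append_int32(b, (uint32_t) n);
    buf_append(b, s, n);
    free(s);
}
```

Both paths must produce identical bytes in buf, so a type without a print
function behaves exactly as it does today.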

This proposal covers steps 2 and 3. If we want to be more aggressive, we
can let the xxxprint functions print to PqSendBuffer directly, but this
is more complex and needs some infrastructure changes. The memcpy in
step 4 accounts for "1.27% __memcpy_avx_unaligned_erms" in my case
above.

What do you think?

--
Best Regards
Andy Fan
