Re: Why is pq_begintypsend so slow?

From: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, David Fetter <david(at)fetter(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why is pq_begintypsend so slow?
Date: 2020-05-18 17:54:03
Message-ID: CAEudQAphOafu0C4PRquVwXindHwrsnDGKpDPr4r57bJe-5zpUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Em seg., 18 de mai. de 2020 às 13:38, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> escreveu:

> Andres Freund <andres(at)anarazel(dot)de> writes:
> >> FWIW, I've also observed, in another thread (the node func generation
> >> thing [1]), that inlining enlargeStringInfo() helps a lot, especially
> >> when inlining some of its callers. Moving e.g. appendStringInfo() inline
> >> allows the compiler to sometimes optimize away the strlen. But if
> >> e.g. an inlined appendBinaryStringInfo() still calls enlargeStringInfo()
> >> unconditionally, successive appends cannot optimize away memory accesses
> >> for ->len/->data.
>
> > With a set of patches doing so, int4send itself is not a significant
> > factor for my test benchmark [1] anymore.
>
> This thread seems to have died out, possibly because the last set of
> patches that Andres posted was sufficiently complicated and invasive
> that nobody wanted to review it. I thought about this again after
> seeing that [1] is mostly about pq_begintypsend overhead, and had
> an epiphany: there isn't really a strong reason for pq_begintypsend
> to be inserting bits into the buffer at all. The bytes will be
> filled by pq_endtypsend, and nothing in between should be touching
> them. So I propose 0001 attached. It's poking into the stringinfo
> abstraction a bit more than I would want to do if there weren't a
> compelling performance reason to do so, but there evidently is.
>
> With 0001, pq_begintypsend drops from being the top single routine
> in a profile of a test case like [1] to being well down the list.
> The next biggest cost compared to text-format output is that
> printtup() itself is noticeably more expensive. A lot of the extra
> cost there seems to be from pq_sendint32(), which is getting inlined
> into printtup(), and there probably isn't much we can do to make that
> cheaper. But eliminating a common subexpression as in 0002 below does
> help noticeably, at least with the rather old gcc I'm using.
>
Again, I see problems with the types declared in Postgres.
1. pq_sendint32 (StringInfo buf, uint32 i)
2. extern void pq_sendbytes (StringInfo buf, const char * data, int
datalen);

Wouldn't it be better to declare outputlen (0002) as uint32?
To avoid converting from (int) to (uint32), even if afterwards there is a
conversion from (uint32) to (int)?

regards,
Ranier Vilela

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-05-18 17:57:19 Re: fill_extraUpdatedCols is done in completely the wrong place
Previous Message Tom Lane 2020-05-18 17:22:38 Re: Vintage unused variables in pg_dump.c