From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru> |
Subject: | Re: Performance improvements for src/port/snprintf.c |
Date: | 2018-10-03 15:52:07 |
Message-ID: | 20181003155207.b3lqmovuv2c5c4id@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2018-10-03 08:20:14 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> >> While there might be value in implementing our own float printing code,
> >> I have a pretty hard time getting excited about the cost/benefit ratio
> >> of that. I think that what we probably really ought to do here is hack
> >> float4out/float8out to bypass the extra overhead, as in the 0002 patch
> >> below.
>
> > I'm thinking we should do a bit more than just that hack. I'm thinking
> > of something (barely tested) like
>
> Meh. The trouble with that is that it relies on the platform's snprintf,
> not sprintf, and that brings us right back into a world of portability
> hurt. I don't feel that the move to C99 gets us out of worrying about
> noncompliant snprintfs --- we're only requiring a C99 *compiler*, not
> libc. See buildfarm member gharial for a counterexample.
Oh, we could just use sprintf() and tell strfromd the buffer is large
enough. I only used snprintf because it seemed more symmetric, and
because I was at most 1/3 awake.
> I'm happy to look into whether using strfromd when available buys us
> anything over using sprintf. I'm not entirely convinced that it will,
> because of the need to ASCII-ize and de-ASCII-ize the precision, but
> it's worth checking.
It's definitely faster. It's not a full-blown format parser, so I guess
the cost of the conversion isn't too bad:
https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/strfrom-skeleton.c;hb=HEAD#l68
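For illustration, here is a minimal sketch of the "ASCII-ize the precision" step discussed above. The helper name `double_to_string_sketch` and its signature are hypothetical; they are not the `pg_double_to_string` from the patch, whose definition is not shown in this message. It assumes glibc 2.25 or later for `strfromd()` and falls back to `snprintf()` elsewhere, and it assumes a precision between 0 and 99.

```c
#define _GNU_SOURCE             /* exposes strfromd() in glibc's <stdlib.h> */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* strfromd() was added in glibc 2.25; use it only when it is available. */
#if defined(__GLIBC__) && \
    (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 25))
#define HAVE_STRFROMD 1
#endif

/*
 * Hypothetical helper (not PostgreSQL's API): format val like "%.<precision>g"
 * into buf. Returns the number of characters that the full result needs,
 * excluding the terminating NUL, as snprintf() does.
 */
static int
double_to_string_sketch(char *buf, size_t buflen, int precision, double val)
{
#ifdef HAVE_STRFROMD
    char        fmt[6];         /* "%.NNg" plus the terminating NUL */
    char       *p = fmt;

    assert(precision >= 0 && precision <= 99);

    /* Build the format by hand: strfromd() accepts no '*' precision. */
    *p++ = '%';
    *p++ = '.';
    if (precision >= 10)
        *p++ = (char) ('0' + precision / 10);
    *p++ = (char) ('0' + precision % 10);
    *p++ = 'g';
    *p = '\0';

    return strfromd(buf, buflen, fmt, val);
#else
    assert(precision >= 0 && precision <= 99);

    /* Fallback: let the general-purpose format parser do the work. */
    return snprintf(buf, buflen, "%.*g", precision, val);
#endif
}
```

A caller would pass a stack buffer and its size, for example `double_to_string_sketch(buf, sizeof(buf), 15, value)`. Building the format string costs only a few byte stores, which fits the observation that the conversion overhead is small next to a full format parser.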
CREATE TABLE somefloats(id serial, data1 float8, data2 float8, data3 float8);
INSERT INTO somefloats(data1, data2, data3) SELECT random(), random(), random() FROM generate_series(1, 10000000);
VACUUM FREEZE somefloats;
I'm comparing the times of:
COPY somefloats TO '/dev/null';
master (including your commit):
16177.202 ms
snprintf using sprintf via pg_double_to_string:
16195.787 ms
snprintf using strfromd via pg_double_to_string:
14856.974 ms
float8out using sprintf via pg_double_to_string:
16176.169 ms
float8out using strfromd via pg_double_to_string:
13532.698 ms
FWIW, it seems that using a local buffer and then pstrdup'ing that in
float8out_internal is a bit faster, and would probably save a bit of
memory on average:
float8out using sprintf via pg_double_to_string, pstrdup:
15370.774 ms
float8out using strfromd via pg_double_to_string, pstrdup:
13498.331 ms
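The local-buffer-then-copy pattern can be sketched outside PostgreSQL roughly as follows. This is an illustrative stand-in, not the code from float8out_internal: `format_double_dup` and `MAX_DOUBLE_CHARS` are hypothetical names, and `malloc()` plus `memcpy()` stand in for `pstrdup()`, which allocates in the current memory context instead.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed worst-case output width for this sketch; not a PostgreSQL constant. */
#define MAX_DOUBLE_CHARS 128

/*
 * Hypothetical helper: format into a stack buffer first, then allocate only
 * as many bytes as the result needs. Returns NULL on failure; the caller
 * frees the result.
 */
static char *
format_double_dup(double val, int precision)
{
    char        local[MAX_DOUBLE_CHARS];
    char       *result;
    int         len;

    assert(precision >= 0);

    len = snprintf(local, sizeof(local), "%.*g", precision, val);
    if (len < 0 || (size_t) len >= sizeof(local))
        return NULL;            /* encoding error or truncated output */

    /* Copy the exact-length string, including the terminating NUL. */
    result = malloc((size_t) len + 1);
    if (result != NULL)
        memcpy(result, local, (size_t) len + 1);
    return result;
}
```

The possible memory saving comes from sizing the allocation to the actual string (often a couple of dozen bytes) instead of reserving the worst-case width for every value.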
Greetings,
Andres Freund