From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | WIP: Make timestamptz_out less slow. |
Date: | 2015-07-27 21:17:42 |
Message-ID: | 20150727211742.GA27681@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
I recently once more noticed that timestamptz_out is really, really
slow. To quantify that, I created a bunch of similar sized tables:
CREATE TABLE tbl_timestamp AS SELECT NOW() FROM generate_series(1, 100000) a, generate_series(1, 100) b;
CREATE TABLE tbl_int8 AS SELECT 1::bigint FROM generate_series(1, 100000) a, generate_series(1, 100) b;
CREATE TABLE tbl_bytea AS SELECT ' '::bytea FROM generate_series(1, 100000) a, generate_series(1, 100) b;
These all end up being 346MB large.
COPY tbl_bytea TO '/dev/null';
Time: 1173.484 ms
COPY tbl_int8 TO '/dev/null';
Time: 1030.756 ms
COPY tbl_timestamp TO '/dev/null';
Time: 6598.030
(all best of three)
Yes, timestamp outputs more data as a whole, but being 5 times as slow
is still pretty darn insane. To make sure that's not the cause, here's
another table:
CREATE TABLE tbl_timestamptext AS SELECT NOW()::text FROM generate_series(1, 100000) a, generate_series(1, 100) b;
COPY tbl_timestamptext TO '/dev/null';
Time: 1449.554 ms
So it's really just the timestamp code.
Profiling it shows:
Overhead Command Shared Object Symbol
- 38.33% postgres_stock libc-2.19.so [.] vfprintf
- 97.92% vfprintf
_IO_vsprintf
- sprintf
+ 70.25% EncodeDateTime
+ 29.75% AppendSeconds.constprop.10
+ 1.11% _IO_default_xsputn
- 8.22% postgres_stock libc-2.19.so [.] _IO_default_xsputn
- 99.43% _IO_default_xsputn
- 90.09% vfprintf
_IO_vsprintf
- sprintf
+ 74.15% EncodeDateTime
+ 25.85% AppendSeconds.constprop.10
+ 9.72% _IO_padn
+ 0.57% vfprintf
+ 7.76% postgres_stock postgres_stock [.] CopyOneRowTo
So nearly all the time is spent somewhere inside the sprintf calls. Not
nice.
The only thing I could come up to make the sprintfs cheaper was to
combine them into one and remove some of the width specifiers that
aren't needed. That doesn't buy us very much.
I then proceeded to replace the sprintf call with hand-rolled
conversions. And wow, the benefit is far bigger than I'd assumed:
postgres[7236][1]=# COPY tbl_timestamp TO '/dev/null';
Time: 2430.521 ms
So, by hand-rolling the ISO conversion in EncodeDateTime() we got a
~250% performance improvement. I'd say that's worthwhile.
The attached patch shows what I did. While there's some polishing
possible, as a whole, it's pretty ugly. But I think timestamp data is so
common that it's worth the effort.
Does anybody have a fundamentally nicer idea than the attached to
improvide this?
Greetings,
Andres Freund
Attachment | Content-Type | Size |
---|---|---|
0001-WIP-faster-timestamp-tz-_out.patch | text/x-patch | 15.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Dean Rasheed | 2015-07-27 21:28:32 | Re: A little RLS oversight? |
Previous Message | Jim Nasby | 2015-07-27 21:15:19 | Re: proposal: multiple psql option -c |