From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE ADD COLUMN fast default
Date: 2018-02-03 00:44:30
Message-ID: e198660e-ca2d-a2ea-6210-5af376a345fe@2ndquadrant.com
Lists: pgsql-hackers
On 02/01/2018 02:54 PM, Andres Freund wrote:
> Hi,
>
> On 2018-01-26 10:53:12 +1030, Andrew Dunstan wrote:
>> Yeah, thanks. revised patch attached
>
> Given that this patch touches code that's a huge bottleneck in a lot of
> cases, I think this needs benchmarks that heavily exercises tuple
> deforming.
>
That's a reasonable request, I guess, and I tried to collect such data
today. I measured two cases:
1) Table with 1000 columns and 64 rows, with no ALTER TABLE adding
columns with 'fast' defaults. This is meant to measure the best case,
with minimal impact from the patch. See the create.sql script attached
to this message, and query.sql, which was used for tests with pgbench
like this:
pgbench -n -f query.sql -T 15 test
After 100 runs, the results (tps) look like this:
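The attachments are not inlined in this message, so as a rough sketch
only, here is one way the attached create.sql and query.sql could be
generated. Column names, types, and the exact query are assumptions
based on the description above, not the actual attachments:

```python
# Hypothetical sketch: generate SQL approximating the attached
# create.sql (1000-column table, 64 rows, no ALTER TABLE involved)
# and query.sql. The real attachments are not reproduced in the
# message, so all names and types here are assumptions.

ncols = 1000
nrows = 64

cols = ",\n    ".join(f"c{i} int" for i in range(1, ncols + 1))
create_sql = (
    f"CREATE TABLE t (\n    {cols}\n);\n"
    f"INSERT INTO t (c1) SELECT i FROM generate_series(1, {nrows}) s(i);\n"
)

# A tiny aggregate query that forces deforming up to the last attribute.
query_sql = "SELECT sum(c1000) FROM t;\n"

print(create_sql.count(" int"))  # 1000 column definitions
```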
            min    max    median
 --------------------------------------
 master    1827   1873     1860
 patched   2023   2066     2056
That is, the patch apparently improves performance by about 10%
(according to perf profiles, this is due to slot_deform_tuple getting
cheaper).
So this case seems fine.
2) Table with 64 rows and 1000 columns, all added by ALTER TABLE with
fast default without rewrite. See create-alter.sql.
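Again, the attachment itself is not inlined; a hypothetical sketch of
what create-alter.sql presumably does (column names and the constant
default are my assumptions) could be generated like this:

```python
# Hypothetical sketch of create-alter.sql: create a narrow table,
# insert the 64 rows first, and only then add the remaining columns
# via ALTER TABLE with a DEFAULT, so each one takes the no-rewrite
# "fast default" path. The actual attachment is not reproduced here.

nrows, ncols = 64, 1000

stmts = [
    "CREATE TABLE t (c1 int);",
    f"INSERT INTO t SELECT 1 FROM generate_series(1, {nrows});",
]
# Adding the columns after the rows exist means no row physically
# contains the new attributes; reads must fall back to the stored
# default value instead of the heap tuple.
stmts += [f"ALTER TABLE t ADD COLUMN c{i} int DEFAULT 1;"
          for i in range(2, ncols + 1)]

script = "\n".join(stmts)
print(script.count("ALTER TABLE"))  # 999 fast-default columns
```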
Using the same query.sql as before, this shows a significant drop to
only about 40 tps (from ~2000 tps on master). The profiles look
something like this:
+ 98.87% 98.87% postgres [.] slot_getmissingattrs
+ 98.77% 0.00% postgres [.] PortalRun
+ 98.77% 0.00% postgres [.] ExecAgg
+ 98.74% 0.01% postgres [.] ExecInterpExpr
which is kinda understandable, although the drop from 2000 to 40 tps
seems pretty significant. Then again, this is constructed as a fairly
extreme corner case.
However, there seems to be some sort of bug. When I did VACUUM FULL, I
expected it to replace the "missing" default values with actual values
stored in the heap rows, eliminating the performance impact. Instead,
the default values got lost and replaced by NULL values, which looks
like a clear data loss scenario.
I'm not quite sure what's wrong, but try this:
\i create-alter.sql
-- this returns 64, which is correct
SELECT COUNT(*) FROM t;
-- this actually returns 64 rows with values "1"
SELECT c1000 FROM t;
-- this returns 63, which is incorrect (should be 64)
SELECT count(c1000) FROM t;
VACUUM FULL t;
-- suddenly we only get NULL values for all 64 rows
SELECT c1000 FROM t;
-- and now we got 0 (instead of 64)
SELECT count(c1000) FROM t;
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment         Content-Type     Size
create.sql         application/sql  199.9 KB
create-alter.sql   application/sql  43.9 KB
query.sql          application/sql  26 bytes