Re: BUG #16967: Extremely slow update statement in trigger

From: David Fetter <david(at)fetter(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16967: Extremely slow update statement in trigger
Date: 2021-04-23 01:50:12
Message-ID: 20210423015012.GI2586@fetter.org
Lists: pgsql-bugs

On Fri, Apr 16, 2021 at 04:52:15PM +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference: 16967
> Logged by: Nicolas Burri
> Email address: postgres(at)burri(dot)li
> PostgreSQL version: 13.2
> Operating system: Linux & Windows
> Description:
>
> Hi,
> During a data migration our production system spent several hours in a
> simple update statement of a table with very few columns and about 200k
> records. We managed to reproduce the problem and a minimal setup for
> reproduction can be found here:
> https://www.dropbox.com/s/mxjavpl43s48hdg/bug_report.sql. The bug seems to
> be related to query plan caching and can be reproduced on Postgres 11.11
> and 13.2 (I have not run tests with other versions).
>
> Description:
> We use a statement level trigger to log changes to records in a certain
> table. On every update, the trigger calls a function comparing “old table”
> and “new table”. For every record changed by the statement, the function
> inserts a new record in a “log” table. In the following scenario, execution
> times explode:
> First, execute an update statement that affects no rows. This query is fast
> and completes within milliseconds. Then, execute a second update statement
> that affects a lot of records. At 200k records, this query runs for more
> than 4h on my workstation. If we call “discard plans” before executing the
> second update statement, or if we do not execute the first statement at all,
> the update of all rows completes within about 1 second.
>
> Thanks and best Regards
> Nicolas
>

I'm including what I found on dropbox below. In future, please include
the entire content in emails so it stays independent of sites that
might not still be there when people look back.

I've managed to reproduce the extremely long execution, which I
haven't yet tried to wait out.

I did find something interesting: here's the EXPLAIN for the update
that's intended to change everything in the table. Note that the
estimated row count is 1:

shackle(at)[local]:5414/slow_stmt_triggers(14devel)(110490) # explain update demo set type = 'black' where type='blue';
                          QUERY PLAN
══════════════════════════════════════════════════════════════
 Update on demo  (cost=0.00..3976.35 rows=0 width=0)
   ->  Seq Scan on demo  (cost=0.00..3976.35 rows=1 width=34)
         Filter: ((type)::text = 'blue'::text)
(3 rows)

Time: 0.707 ms

Next, I run ANALYZE on that table to collect statistics, as people
generally do just after a bulk load to ensure that the planner is
acting on reasonably accurate statistics.

shackle(at)[local]:5414/slow_stmt_triggers(14devel)(110490) # analyze demo;
ANALYZE
Time: 35.511 ms

...and run the EXPLAIN again:

shackle(at)[local]:5414/slow_stmt_triggers(14devel)(110490) # explain update demo set type = 'black' where type='blue';
                             QUERY PLAN
═══════════════════════════════════════════════════════════════════
 Update on demo  (cost=0.00..3582.00 rows=0 width=0)
   ->  Seq Scan on demo  (cost=0.00..3582.00 rows=200000 width=34)
         Filter: ((type)::text = 'blue'::text)
(3 rows)

Time: 0.772 ms

Now the row estimate is 200,000, which is to say exactly what's in
the table.

Now, I run the whole-table UPDATE, and:

shackle(at)[local]:5414/slow_stmt_triggers(14devel)(110490) # update demo set type = 'black' where type='blue';
UPDATE 200000
Time: 2208.754 ms (00:02.209)

Perhaps my machine is a little slower than the one the original test
was run on, but updating a little under a hundred thousand rows a
second, complete with a matching hundred thousand rows a second
INSERTed into the log table, doesn't seem terrible.

tl;dr: running ANALYZE after a bulk load is a good idea, and I don't
see a bug here.
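For reference, here is a sketch of the two workarounds that came up in
this thread, applied to the repro below. This assumes the demo/log
schema from the attached script; either statement on its own is enough
to avoid the slowdown:

```sql
-- Option 1: after the bulk load, refresh planner statistics before the
-- first UPDATE fires the trigger, so any plan that gets cached is built
-- on accurate row counts:
ANALYZE demo;

-- Option 2: drop all cached plans between the no-op UPDATE and the big
-- one, forcing the trigger's INSERT ... SELECT to be re-planned:
DISCARD PLANS;
```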

Original self-contained repro:

-----8<-------------------------------------------------------------------------
-- on this table we will execute a simple update statement
create table demo (
id int not null primary key,
type varchar(5)
);

-- this is our log table that has a record written to for every updated record in the "demo" table.
create table log (
data_after text,
data_before text
);

-- This function is executed by an update trigger (defined below) attached to the "demo" table.
-- It joins the "before" and "after" transition tables provided by the statement-level trigger;
-- for every record with a changed value, both "before" and "after" data are written to the "log" table.
CREATE OR REPLACE FUNCTION log_function()
RETURNS TRIGGER
AS
$$
BEGIN
    INSERT INTO log (data_after, data_before)
    SELECT new_data.row AS data_after,
           old_data.row AS data_before
    FROM (SELECT row_number() OVER () AS id,
                 ROW(new_table.*)
          FROM new_table) new_data
    JOIN (SELECT row_number() OVER () AS id,
                 ROW(old_table.*)
          FROM old_table) old_data
      ON (new_data.id = old_data.id)
    WHERE new_data.row IS DISTINCT FROM old_data.row;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;

-- This attaches an update trigger to the "demo" table. On UPDATE, log_function() is called.
CREATE TRIGGER log_trigger
    AFTER UPDATE ON demo
    REFERENCING OLD TABLE AS old_table NEW TABLE AS new_table
    FOR EACH STATEMENT EXECUTE PROCEDURE log_function();

truncate demo; -- just to make sure all tables are empty
truncate log; -- just to make sure all tables are empty

insert into demo (id, type) select i, 'blue' from generate_series(1,200000) s(i); -- generate 200k sample records in the "demo" table. The log table is still empty after the execution.

update demo set type = 'black' where type='red'; -- No rows are updated and the query executes in 16ms. After the execution of this query, "log" is still empty, as no records have changed.

update demo set type = 'black' where type='blue'; -- If run after the query above, this query does not stop within a reasonable amount of time (hours). However, if the red -> black update statement is not run first, this query takes less than 1.2s: it updates all 200k records in the demo table and creates 200k new records in the "log" table.

-- Additional information: if we call "discard plans" between the two update statements above, the second one also completes in about 1.2s.
truncate demo; -- just to make sure all tables are empty
truncate log; -- just to make sure all tables are empty
insert into demo (id, type) select i, 'blue' from generate_series(1,200000) s(i); -- generate 200k sample records in the "demo" table. The log table is still empty after the execution.
update demo set type = 'black' where type='red'; -- No rows are updated and the query executes in 16ms. After the execution of this query, "log" is still empty, as no records have changed.
discard plans;
update demo set type = 'black' where type='blue';

---- just for checking results. Not part of the reproduction
select * from demo;
select * from log;
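One more hedged suggestion, not part of the original repro: on
PostgreSQL 12 and later (so not on the reporter's 11.11 instance),
the plan_cache_mode setting can be used to rule cached generic plans
in or out when chasing problems like this:

```sql
-- Force a fresh custom plan on every execution instead of a cached
-- generic one; if the slow UPDATE becomes fast under this setting,
-- plan caching is implicated.
SET plan_cache_mode = force_custom_plan;
update demo set type = 'black' where type='blue';
RESET plan_cache_mode;
```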

--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
