Logical replication issue with row level trigger

From: Aleš Zelený <zeleny(dot)ales(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Logical replication issue with row level trigger
Date: 2019-01-13 19:41:46
Message-ID: CAODqTUZsf8BasR-3SY3V7jeDVj-v4-komPpsR-SFLfp78dDiAQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello,

I've found an issue with logical replication causing
1) large memory allocation by wal sender process (RSS 60GB)
2) large amount of "xid" files in $PGDATA/pg_replslot/<repl_slot_name>
directory - seems to be amount of rows handled by trigger + few more

Having several millions of files in a single directory for replication slot
makes it hard to work on (tested ext4 and zfs).
Regarding the memory - such large allocation was no expected and as result
swap was used with all performance impact expected.

Environment and conditions to reproduce:

PostgreSQL 10.6 (Ubuntu 10.6-1.pgdg16.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609, 64-bit

create trigger function with:
1) DML modifying data in another table
2) use EXCEPTION clause (implicit subtransaction [
https://www.postgresql.org/docs/10/plpgsql-structure.html])

Note, that is does not matter, whether tables used for testcase are
replicated or no (this is what I use in testcase).

# publication side
pg_createcluster -d /pgsql/pgclusterzfs/i_pub/10/ipub -p 5532 10 ipub -- -k
pg_ctlcluster 10 ipub start
psql -p 5532 -c "alter system set wal_level=logical;"
pg_ctlcluster 10 ipub restart

psql -p 5532 <<EOF
create table foo( foo_id bigserial, foo_val text, PRIMARY KEY (foo_id));
create table bar( bar_id bigserial, bar_val bigint, PRIMARY KEY (bar_id));
create publication pub_test FOR TABLE foo, bar;
EOF

# subscription side
pg_createcluster -d /pgsql/pgclusterzfs/i_sub/10/isub -p 5533 10 isub -- -k
pg_ctlcluster 10 isub start
psql -p 5533 <<EOF
create table foo( foo_id bigint, foo_val text, PRIMARY KEY (foo_id));
create table bar( bar_id bigint, bar_val bigint, PRIMARY KEY (bar_id));
create subscription sub_test CONNECTION 'port=5532 dbname=postgres'
PUBLICATION pub_test;
EOF

# publication side
psql -p 5532 <<EOF
create table local_bar( bar_id bigserial, bar_val bigint, PRIMARY KEY
(bar_id));

create table local_bar_aud (
local_bar_aud_id bigserial
, bar_id bigint not null
, bar_val bigint
, aud_op text not null
, aud_time timestamptz not null
, aud_current_user text not null
, aud_session_user text not null
, aud_pid int
, PRIMARY KEY (local_bar_aud_id)
);
EOF

Trigger function :

CREATE OR REPLACE FUNCTION public.local_bar_aud()
RETURNS trigger
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path TO 'pg_catalog', 'public'
AS $function$
BEGIN
IF ( TG_OP = 'INSERT' ) THEN
INSERT INTO public.local_bar_aud (
bar_id,
bar_val,
aud_op,
aud_time,
aud_current_user,
aud_session_user,
aud_pid
) VALUES (
NEW.bar_id,
NEW.bar_val,
TG_OP,
now(),
current_user::text,
session_user::text,
pg_backend_pid()
);
RETURN NEW;
END IF;
EXCEPTION
WHEN OTHERS THEN
RAISE EXCEPTION 'ERROR [OTHER] from trigger function.';
RETURN NULL;
END;
$function$;

psql -p 5532 <<EOF
CREATE TRIGGER local_bar_audit_trg AFTER INSERT ON local_bar FOR EACH ROW
EXECUTE PROCEDURE local_bar_aud();

insert into local_bar (bar_val) select i from generate_series(1, 1E6) gs(i);
EOF

Using 1 mio rows is enough to show that wal sended allocates ~4.5GB in RSS
and create 1mio xid files in the $PGDATA/pg_replslot/sub_test directory.

If the exception clause is removed or the trigger did not manipulate with
other table data (e.g. setting a new vale if used before insert), there is
no memory issue at all.

Similar impact to wal sender memory allocation can be also simulated by
simple script:

begin transaction;

SAVEPOINT my_savepoint; INSERT INTO public.bar (bar_id, bar_val) VALUES
(49471972, 1); RELEASE SAVEPOINT my_savepoint;
SAVEPOINT my_savepoint; INSERT INTO public.bar (bar_id, bar_val) VALUES
(49471973, 2); RELEASE SAVEPOINT my_savepoint;
SAVEPOINT my_savepoint; INSERT INTO public.bar (bar_id, bar_val) VALUES
(49471974, 3); RELEASE SAVEPOINT my_savepoint;
.... repeat 1 mio rows ....
commit transaction;

I rewrite the original trigger to most minimalistic version I was able to
reproduce the issue, so the exception here is just to be able demonstrate
my finding.

I'll ask developers to remove exception handling from triggers if possible
(or use statement triggers, what works well if I reworked this "auditing"
example above).

Questions:
Is there a way how to mitigate such issue to stop a transaction fast enough
in case it might result in huge memory allocation of wal sender processes
(I even run out of 300GB swap, since I have multiple subscribers and had to
drop subscriptions - even deleteing the "xid" files lasts severla hours...)
?

Is the above a bug, or just a trigger function programming issue (I haven't
found in docs reason not to have exception caluse in trigger function [yes,
there is noted exception clause impact to performance])?

Thanks for advises/hints

Ales Zeleny

Browse pgsql-general by date

  From Date Subject
Next Message Zach van Rijn 2019-01-13 19:42:33 Re: Static PostgreSQL Binaries (Linux + Windows)
Previous Message Michel Pelletier 2019-01-13 18:51:45 Re: Question about MemoryContextRegisterResetCallback