Optimize commit performance with a large number of 'on commit delete rows' temp tables

From: feichanghong <feichanghong(at)qq(dot)com>
To: pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Optimize commit performance with a large number of 'on commit delete rows' temp tables
Date: 2024-07-05 15:19:22
Message-ID: tencent_924E990F0493010E2C8404A5D677C70C9707@qq.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,
&nbsp;&nbsp;
# Background

PostgreSQL maintains a list of temporary tables for 'on commit
drop/delete rows' via an on_commits list in the session. Once a
transaction accesses a temp table or namespace, the
XACT_FLAGS_ACCESSEDTEMPNAMESPACE flag is set. Before committing, the
PreCommit_on_commit_actions function truncates all 'commit delete
rows' temp tables, even those not accessed in the current transaction.
Commit performance can degrade if there are many such temp tables.

In practice, users created many 'commit delete rows' temp tables in a
session, but each transaction only accessed a few. With varied access
frequency, users were reluctant to change to 'on commit drop'.

Below is an example showing the effect of the number of temp tables
on commit performance:
```
-- 100
DO $$
DECLARE
&nbsp; &nbsp; begin
&nbsp; &nbsp; &nbsp; &nbsp; FOR i IN 1..100 LOOP
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; EXECUTE format('CREATE TEMP TABLE temp_table_%s (id int) on commit delete ROWS', i) ;
&nbsp; &nbsp; &nbsp; &nbsp; END LOOP;
&nbsp; &nbsp; END;
$$;
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 1.325 ms
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 1.330 ms
```

```
-- 1000
DO $$
DECLARE
&nbsp; &nbsp; begin
&nbsp; &nbsp; &nbsp; &nbsp; FOR i IN 1..1000 LOOP
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; EXECUTE format('CREATE TEMP TABLE temp_table_%s (id int) on commit delete ROWS', i) ;
&nbsp; &nbsp; &nbsp; &nbsp; END LOOP;
&nbsp; &nbsp; END;
$$;
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 10.939 ms
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 10.955 ms
```

```
-- 10000
DO $$
DECLARE
&nbsp; &nbsp; begin
&nbsp; &nbsp; &nbsp; &nbsp; FOR i IN 1..10000 LOOP
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; EXECUTE format('CREATE TEMP TABLE temp_table_%s (id int) on commit delete ROWS', i) ;
&nbsp; &nbsp; &nbsp; &nbsp; END LOOP;
&nbsp; &nbsp; END;
$$;
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 110.253 ms
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 175.875 ms
```

# Solution

An intuitive solution is to truncate only the temp tables that
the current process has accessed upon transaction commit.

In the attached patch (based on HEAD):
- A Bloom filter (can also be a list or hash table) maintains
the temp tables accessed by the current transaction.
- Only temp tables filtered through the Bloom filter need
truncation. False positives may occur, but they are
acceptable.
- The Bloom filter is reset at the start of the transaction,
indicating no temp tables have been accessed by the
current transaction yet.

After optimization, the performance for the same case is as
follows:
```
-- 100
DO $$
DECLARE
&nbsp; &nbsp; begin
&nbsp; &nbsp; &nbsp; &nbsp; FOR i IN 1..100 LOOP
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; EXECUTE format('CREATE TEMP TABLE temp_table_%s (id int) on commit delete ROWS', i) ;
&nbsp; &nbsp; &nbsp; &nbsp; END LOOP;
&nbsp; &nbsp; END;
$$;
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 0.447 ms
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 0.453 ms
```

```
-- 1000
DO $$
DECLARE
&nbsp; &nbsp; begin
&nbsp; &nbsp; &nbsp; &nbsp; FOR i IN 1..1000 LOOP
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; EXECUTE format('CREATE TEMP TABLE temp_table_%s (id int) on commit delete ROWS', i) ;
&nbsp; &nbsp; &nbsp; &nbsp; END LOOP;
&nbsp; &nbsp; END;
$$;
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 0.531 ms
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 0.567 ms
```

```
-- 10000
DO $$
DECLARE
&nbsp; &nbsp; begin
&nbsp; &nbsp; &nbsp; &nbsp; FOR i IN 1..10000 LOOP
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; EXECUTE format('CREATE TEMP TABLE temp_table_%s (id int) on commit delete ROWS', i) ;
&nbsp; &nbsp; &nbsp; &nbsp; END LOOP;
&nbsp; &nbsp; END;
$$;
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 1.370 ms
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 1.362 ms
```

Hoping for some suggestions from hackers.

Best Regards,
Fei Changhong

&nbsp;

Attachment Content-Type Size
v1-0000-Optimize-commit-with-temp-tables.patch application/octet-stream 8.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2024-07-05 15:41:33 Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.
Previous Message Nathan Bossart 2024-07-05 15:12:34 Re: 回复:Re: speed up pg_upgrade with large number of tables