Re: Increased work_mem for "logical replication tablesync worker" only?

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Dmitry Koterov <dmitry(dot)koterov(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Increased work_mem for "logical replication tablesync worker" only?
Date: 2025-02-07 10:31:13
Message-ID: CALDaNm1Vo7JB4b9BrpCb2a3jSmtz6kBsC6p8gkFJROYnKrKjUg@mail.gmail.com
Lists: pgsql-hackers

On Mon, 3 Feb 2025 at 17:51, Dmitry Koterov <dmitry(dot)koterov(at)gmail(dot)com> wrote:
>
> Here is the list of tmp files:
>
> postgres(at)pg-101a:~/17/main/base/pgsql_tmp$ ls -la
> total 5422297
> drwx------ 2 postgres postgres 9 Feb 3 04:08 .
> drwx------ 8 postgres postgres 8 Jan 29 01:27 ..
> -rw------- 1 postgres postgres 1073741824 Feb 3 04:05 pgsql_tmp196534.0
> -rw------- 1 postgres postgres 1073741824 Feb 3 04:05 pgsql_tmp196534.1
> -rw------- 1 postgres postgres 1073741824 Feb 3 04:05 pgsql_tmp196534.2
> -rw------- 1 postgres postgres 1073741824 Feb 3 04:06 pgsql_tmp196534.3
> -rw------- 1 postgres postgres 1073741824 Feb 3 04:07 pgsql_tmp196534.4
> -rw------- 1 postgres postgres 1073741824 Feb 3 04:08 pgsql_tmp196534.5
> -rw------- 1 postgres postgres 819396608 Feb 3 04:08 pgsql_tmp196534.6
>
> With work_mem=4GB, all those files on the destination node seemed to appear immediately with 4GB size and keep growing since then, while COPY progresses on the source node (i.e. it looked like PG tried hard to utilize work_mem, but after reaching the limit, dumped everything to pgsql_tmp still).
>
> The table structure being copied (just 1 index there):
>
> CREATE TABLE mytable (
> id bigint NOT NULL PRIMARY KEY,
> snippet bytea,
> title bytea,
> updated_at timestamp with time zone,
> rich_snippet bytea
> );

I prepared a logical replication setup with the following:
CREATE TABLE mytable (
id bigint NOT NULL PRIMARY KEY,
snippet bytea default decode(repeat('1234567890',10000),'hex'),
title bytea default decode(repeat('1234567890',10000),'hex'),
updated_at timestamp with time zone default 'now',
rich_snippet bytea default decode(repeat('1234567890',10000),'hex'));

-- Specify data only for the first column; the remaining columns
-- will use their default values
insert into mytable values(generate_series(1,100000));

This did not create any temp files, even with a low work_mem
configuration of 64kB. I had enabled "log_temp_files = 0" to
confirm that no temp files were created.
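As a cross-check, cumulative temp-file activity can also be read from
the pg_stat_database statistics view (a quick sketch; these counters
accumulate until the statistics are reset):

SELECT datname, temp_files, temp_bytes
FROM pg_stat_database
WHERE datname = current_database();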
Temp files are generally created when operations such as ORDER BY,
DISTINCT, hash joins, hash-based aggregation, memoize nodes, and
merge joins exceed work_mem.
For example, the following creates temp files:
select * from mytable order by 2,3,4,5;
2025-02-07 15:10:25.409 IST [586494] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp586494.0", size 65265664
2025-02-07 15:10:25.464 IST [586495] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp586495.0", size 60481536
2025-02-07 15:10:25.514 IST [586492] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp586492.0", size 54730752

Does the table have any triggers that could invoke another statement?
Could you provide more details or simulate a scenario using a table
like the one above with insert operations? That would be really
helpful.

Regards,
Vignesh
