Re: Increased work_mem for "logical replication tablesync worker" only?

From: Dmitry Koterov <dmitry(dot)koterov(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Increased work_mem for "logical replication tablesync worker" only?
Date: 2025-02-05 04:20:28
Message-ID: CA+CZih7=jCJswbxBF_DoP9fnasfrfTAmK+ZZNBTLYjCXRXR8+w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Only one index, the primary key.

ChatGPT tells that temp files may be used, when bytea columns (or heavily
toasted ones) are utilized, but it can’t explain, why. To me, it’s hard to
believe that on the destination, the COPY stream receiver puts almost all
of the received data to temp files, then waits for COPY to finish, and only
then moves that temp files to data area, because of the large transaction
or something. Would be cool to understand the reason.

On Mon, Feb 3, 2025 at 22:31 Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Mon, Feb 3, 2025 at 5:51 PM Dmitry Koterov <dmitry(dot)koterov(at)gmail(dot)com>
> wrote:
> >
> > Here is the list of tmp files:
> >
> > postgres(at)pg-101a:~/17/main/base/pgsql_tmp$ ls -la
> > total 5422297
> > drwx------ 2 postgres postgres 9 Feb 3 04:08 .
> > drwx------ 8 postgres postgres 8 Jan 29 01:27 ..
> > -rw------- 1 postgres postgres 1073741824 Feb 3 04:05 pgsql_tmp196534.0
> > -rw------- 1 postgres postgres 1073741824 Feb 3 04:05 pgsql_tmp196534.1
> > -rw------- 1 postgres postgres 1073741824 Feb 3 04:05 pgsql_tmp196534.2
> > -rw------- 1 postgres postgres 1073741824 Feb 3 04:06 pgsql_tmp196534.3
> > -rw------- 1 postgres postgres 1073741824 Feb 3 04:07 pgsql_tmp196534.4
> > -rw------- 1 postgres postgres 1073741824 Feb 3 04:08 pgsql_tmp196534.5
> > -rw------- 1 postgres postgres 819396608 Feb 3 04:08 pgsql_tmp196534.6
> >
> > With work_mem=4GB, all those files on the destination node seemed to
> appear immediately with 4GB size and keep growing since then, while COPY
> progresses on the source node (i.e. it looked like PG tried hard to utilize
> work_mem, but after reaching the limit, dumped everything to pgsql_tmp
> still).
> >
> > The table structure being copied (just 1 index there):
> >
> > CREATE TABLE mytable (
> > id bigint NOT NULL PRIMARY KEY,
> > snippet bytea,
> > title bytea,
> > updated_at timestamp with time zone,
> > rich_snippet bytea
> > );
> >
>
> Do you have any indexes on the table? I am not aware if a direct copy
> to table from tablesync uses tmp files.
>
> --
> With Regards,
> Amit Kapila.
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2025-02-05 04:56:22 Re: Separate GUC for replication origins
Previous Message Andrei Lepikhov 2025-02-05 04:03:52 Re: [PATCH] Optionally record Plan IDs to track plan changes for a query