Re: Increased work_mem for "logical replication tablesync worker" only?

From: Dmitry Koterov <dmitry(dot)koterov(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Increased work_mem for "logical replication tablesync worker" only?
Date: 2025-02-03 11:04:03
Message-ID: CA+CZih58LboKrnEWB6Xhsj82UaqZ56Mwc0ONuHwvVgcwKSqXbQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi.

1. Those are temp files on the destination node (where the logical
subscription exists and tablesync worker runs), not on the source. On the
source, it’s all clear.

2. No “spill” suffix/substring in the file names. I tried to look at the
content of these temp files, I I saw some text fragments from the original
table’s text column there. I.e. it looks like for some reason, the stream
received from the source node’s COPY command goes to that temp files (at
least partially).

3. I made several more experiments, increasing work_mem to several GB (for
the role which tablesync worker uses when copying) definitely helps with
temp files.

Thanks!

On Sun, Feb 2, 2025 at 19:10 Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Sun, Feb 2, 2025 at 5:13 PM Dmitry Koterov <dmitry(dot)koterov(at)gmail(dot)com>
> wrote:
> >
> > Trying to monitor perf during the initial tablesync phase (COPY) right
> after CREATE SUBSCRIPTION. I noticed that the size of
> 17/main/base/pgsql_tmp on the destination node grows (tens of gigabytes) as
> the COPY command (running internally on the publisher) progresses. Then in
> the end (when its "EXPLAIN SELECT 1 FROM tbl" on the destination shows the
> approximate number of rows equals to the number of rows on the source node)
> it hangs for several minutes, and then 17/main/base/pgsql_tmp empties, and
> the subscription progresses.
> >
> > It seems like if I increase work_mem to several GB, then the growth of
> 17/main/base/pgsql_tmp becomes less significant.
> >
> > Questions:
> >
> > 1. Are there some diagnostics commands that would allow me to figure out
> what is in those tmp files? Why does the subscriber create those tmp files
> and not just write directly to the data files and WAL? (The table has 2
> bytea columns, i.e. it's TOASTed for sure.)
> >
>
> We do write spill files (ending with '.spill') if the changes are
> large. Can you please share the name of tmp files to avoid any
> assumptions?
>
> --
> With Regards,
> Amit Kapila.
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2025-02-03 11:08:52 Re: Index AM API cleanup
Previous Message Bertrand Drouvot 2025-02-03 10:37:54 Re: Show WAL write and fsync stats in pg_stat_io