From: | Dmitry Koterov <dmitry(dot)koterov(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Increased work_mem for "logical replication tablesync worker" only? |
Date: | 2025-02-03 12:49:32 |
Message-ID: | CA+CZih5tbxFtO3eTy6G8w2+MuFBJ1Ex2JsLUURjsr6K_0xysVw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
What's also suspicious is that on the destination node, after copying
finishes,
# explain select 1 from mytable;
Seq Scan on mytable (cost=0.00..6821514.12 rows=250410012 width=4)
# SELECT relname, n_live_tup FROM pg_stat_user_tables WHERE relname =
'mytable';
relname | estimated_rows
-------------+----------------
mytable | 150342468
Notice the discrepancy between how many rows EXPLAIN thinks there is in the
table, and how many rows pg_stat_user_tables thinks about it (250M vs.
150M).
On the source node, those 2 numbers are almost the same. It's only on the
destination they are different, right after the copying. (There are not a
lot of writes to this table happening on the source happening while copying
BTW.)
Maybe that could hint on why temp files are used?
On Mon, Feb 3, 2025 at 4:21 AM Dmitry Koterov <dmitry(dot)koterov(at)gmail(dot)com>
wrote:
> Here is the list of tmp files:
>
> postgres(at)pg-101a:~/17/main/base/pgsql_tmp$ ls -la
> total 5422297
> drwx------ 2 postgres postgres 9 Feb 3 04:08 .
> drwx------ 8 postgres postgres 8 Jan 29 01:27 ..
> -rw------- 1 postgres postgres 1073741824 Feb 3 04:05 pgsql_tmp196534.0
> -rw------- 1 postgres postgres 1073741824 Feb 3 04:05 pgsql_tmp196534.1
> -rw------- 1 postgres postgres 1073741824 Feb 3 04:05 pgsql_tmp196534.2
> -rw------- 1 postgres postgres 1073741824 Feb 3 04:06 pgsql_tmp196534.3
> -rw------- 1 postgres postgres 1073741824 Feb 3 04:07 pgsql_tmp196534.4
> -rw------- 1 postgres postgres 1073741824 Feb 3 04:08 pgsql_tmp196534.5
> -rw------- 1 postgres postgres 819396608 Feb 3 04:08 pgsql_tmp196534.6
>
> With work_mem=4GB, all those files *on the destination node* seemed to
> appear immediately with 4GB size and keep growing since then, while COPY
> progresses on the source node (i.e. it looked like PG tried hard to utilize
> work_mem, but after reaching the limit, dumped everything to pgsql_tmp
> still).
>
> The table structure being copied (just 1 index there):
>
> CREATE TABLE mytable (
> id bigint NOT NULL PRIMARY KEY,
> snippet bytea,
> title bytea,
> updated_at timestamp with time zone,
> rich_snippet bytea
> );
>
> Directories sizes on the destination node while tablesync is working (it's
> copied in to an almost empty database):
>
> $ watch du -sh 17/main/base/* 17/main/pg_wal
> 2.2M 17/main/base/1
> 14G 17/main/base/16385
> 2.3M 17/main/base/16387
> 2.2M 17/main/base/4
> 2.3M 17/main/base/5
> 12G 17/main/base/pgsql_tmp
> 6.3G 17/main/pg_wal
>
> So the question, why does it use temp files. Why not just writes directly
> to WAL+data.
>
>
> On Mon, Feb 3, 2025 at 3:04 AM Dmitry Koterov <dmitry(dot)koterov(at)gmail(dot)com>
> wrote:
>
>> Hi.
>>
>> 1. Those are temp files on the destination node (where the logical
>> subscription exists and tablesync worker runs), not on the source. On the
>> source, it’s all clear.
>>
>> 2. No “spill” suffix/substring in the file names. I tried to look at the
>> content of these temp files, I I saw some text fragments from the original
>> table’s text column there. I.e. it looks like for some reason, the stream
>> received from the source node’s COPY command goes to that temp files (at
>> least partially).
>>
>> 3. I made several more experiments, increasing work_mem to several GB
>> (for the role which tablesync worker uses when copying) definitely helps
>> with temp files.
>>
>> Thanks!
>>
>> On Sun, Feb 2, 2025 at 19:10 Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>>> On Sun, Feb 2, 2025 at 5:13 PM Dmitry Koterov <dmitry(dot)koterov(at)gmail(dot)com>
>>> wrote:
>>> >
>>> > Trying to monitor perf during the initial tablesync phase (COPY) right
>>> after CREATE SUBSCRIPTION. I noticed that the size of
>>> 17/main/base/pgsql_tmp on the destination node grows (tens of gigabytes) as
>>> the COPY command (running internally on the publisher) progresses. Then in
>>> the end (when its "EXPLAIN SELECT 1 FROM tbl" on the destination shows the
>>> approximate number of rows equals to the number of rows on the source node)
>>> it hangs for several minutes, and then 17/main/base/pgsql_tmp empties, and
>>> the subscription progresses.
>>> >
>>> > It seems like if I increase work_mem to several GB, then the growth of
>>> 17/main/base/pgsql_tmp becomes less significant.
>>> >
>>> > Questions:
>>> >
>>> > 1. Are there some diagnostics commands that would allow me to figure
>>> out what is in those tmp files? Why does the subscriber create those tmp
>>> files and not just write directly to the data files and WAL? (The table has
>>> 2 bytea columns, i.e. it's TOASTed for sure.)
>>> >
>>>
>>> We do write spill files (ending with '.spill') if the changes are
>>> large. Can you please share the name of tmp files to avoid any
>>> assumptions?
>>>
>>> --
>>> With Regards,
>>> Amit Kapila.
>>>
>>
From | Date | Subject | |
---|---|---|---|
Next Message | Aleksander Alekseev | 2025-02-03 12:53:44 | Re: new commitfest transition guidance |
Previous Message | jian he | 2025-02-03 12:49:15 | Re: Non-text mode for pg_dumpall |