From: | Marco Slot <marco(dot)slot(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Önder Kalacı <onderkalaci(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher |
Date: | 2022-07-20 07:15:12 |
Message-ID: | CAFMSG9G0Pr=feCDwxJGF2=xBSG69iXJZLDiFKyr3hD7bYhb33Q@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Jul 18, 2022 at 8:29 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> IIUC, this proposal is to optimize cases where users can't have a
> unique/primary key for a relation on the subscriber and those
> relations receive lots of updates or deletes?
I think this patch optimizes for all non-trivial cases of update/delete
replication (e.g. >1000 rows in the table, >1000 rows per hour updated)
without a primary key. For instance, it's quite common to have a large
append-mostly events table without a primary key (e.g. because of
partitioning, or insertion speed), which will still have occasional batch
updates/deletes.
Imagine an update of a table or partition with 1 million rows and a typical
scan speed of 1M rows/sec. An update on the whole table takes maybe 1-2
seconds. Replicating that update with one sequential scan per row means up
to 1M scans of about 1 second each, so on the order of 1M seconds, or
roughly 12 days.
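As a rough back-of-envelope check of those numbers, here is a small sketch. The function name is made up for illustration, and it assumes the worst case where every replicated row change scans the whole table; on average a matching row would be found about halfway through, which does not change the order of magnitude.

```python
def seq_scan_apply_seconds(table_rows: int, updated_rows: int,
                           scan_rows_per_sec: float) -> float:
    """Estimate apply time when each replicated row change does a full sequential scan."""
    # Time for one full pass over the table.
    seconds_per_lookup = table_rows / scan_rows_per_sec
    # One such pass per updated row (worst-case assumption).
    return updated_rows * seconds_per_lookup


total_seconds = seq_scan_apply_seconds(1_000_000, 1_000_000, 1_000_000)
days = total_seconds / 86_400  # seconds per day
print(f"{total_seconds:,.0f} seconds ~= {days:.1f} days")
# prints "1,000,000 seconds ~= 11.6 days"
```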
The current implementation makes REPLICA IDENTITY FULL a huge liability,
and effectively impractical, for scenarios where you want to replicate an
arbitrary set of user-defined tables, such as upgrades, migrations, and
shard moves. We generally recommend that users tolerate update/delete
errors in such scenarios.
If the apply worker can use an index, the data migration tool can
tactically create one on a high-cardinality column, which would practically
always be better than doing a sequential scan for non-trivial workloads.
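To illustrate why even a non-unique index on a high-cardinality column helps, here is a toy model in Python. It is not the apply worker's code or the patch's logic; the helper names and the in-memory "table" are assumptions for illustration only. It counts how many rows must be compared against the full old row, with and without an index.

```python
from collections import defaultdict


def build_index(rows, col):
    """Map each value of column `col` to the positions of rows holding it."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[col]].append(pos)
    return index


def rows_examined_seq(rows, target):
    """Rows compared when locating `target` by scanning from the start."""
    examined = 0
    for row in rows:
        examined += 1
        if row == target:
            break
    return examined


def rows_examined_indexed(rows, index, col, target):
    """Rows compared when only index matches on `col` are checked in full."""
    examined = 0
    for pos in index.get(target[col], ()):
        examined += 1
        if rows[pos] == target:
            break
    return examined


# Hypothetical table: column 0 has 1000 distinct values across 10,000 rows.
rows = [(i % 1000, f"event-{i}") for i in range(10_000)]
index = build_index(rows, 0)
target = rows[-1]  # worst case for the sequential scan

print(rows_examined_seq(rows, target))                # 10000
print(rows_examined_indexed(rows, index, 0, target))  # 10
```

The index does not need to be unique: it only narrows the candidates, and the full-row comparison still decides which row matches.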
cheers,
Marco