From: | Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Logical replication existing data copy |
Date: | 2017-01-18 13:46:56 |
Message-ID: | 5a90e2ff-6375-2e30-d0af-f6d1f672429a@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 11/01/17 17:10, Peter Eisentraut wrote:
> On 12/19/16 4:30 AM, Petr Jelinek wrote:
>> This patch implements data synchronization for the logical replication.
>> It works both for initial setup of subscription as well as for tables
>> added later to replication when the subscription is already active.
>
> First detailed read-through. General structure makes sense.
Thanks!
>
> Comments on some details:
>
> No catalogs.sgml documentation for pg_subscription_rel. When that is
> added, I would emphasize that entries are only added when relations
> are first encountered, not immediately when a table is added to a
> publication. So it is unlike pg_publication_rel in that respect.
>
It's not even first encountered, but I did explain.
> Rename max_subscription_sync_workers ->
> max_sync_workers_per_subscription (similar to
> max_parallel_workers_per_gather and others)
>
Makes sense.
> About the changes to COPY: It took me a while to get clarity on the
> direction of things. Is the read_cb reading the table, or the data?
> You are copying data produced by a function into a table, so
> produce_cb or data_source_cb could be clearer.
>
The data_source_cb sounds good to me.
> SetSubscriptionRelState(): This is doing an "upsert", so I don't think
> a RowExclusiveLock is enough, or it needs to be able to retry on
> concurrent changes. I suppose there shouldn't be more than one
> concurrent change per sub/rel pair, but that would need to be
> explained there.
>
Well technically there can be some concurrency via the ALTER
SUBSCRIPTION ... REFRESH so I increased lock level (and same for Remove).
> SetSubscriptionRelState(): The memset(values, 0, sizeof(values)) is
> kind of in an odd place. Possibly not actually needed.
>
It might not be needed but we traditionally do it anyway. I moved it a bit.
> GetSubscriptionRelState(): Prefer error messages that identify the
> objects by name. (Also subid should be %u.)
>
Well this is cache lookup failure though, who's to know that the objects
actually exist in that case.
> GetSubscriptionRelationsFilter(): The state and filter arguments are
> kind of weird. And there are only two callers, so all this complexity
> is not even used. Perhaps, if state == SUBREL_STATE_UNKNOWN, then
> return everything, else return exactly the state specified. The case
> of everything-but-the-state-specified does not appear to be needed.
>
I see this was bit confusing so I split the function into two. Actually
the 2 usecases used are everything and everything except SUBREL_STATE_READY.
> GetSubscriptionRelationsFilter(): RowExclusiveLock is probably too
> much
>
Yes, same with GetSubscriptionRelState().
> AlterSubscription_refresh(): remote_table_oids isn't really the
> remote OIDs of the tables but the local OIDs of the remote tables.
> Consider clearer variable naming in that function.
>
Done.
> interesting_relation(): very generic name, maybe
> should_apply_changes_for_rel(). Also the comment mentions a "parallel
> worker" -- maybe better a "separate worker process running in
> parallel", since a parallel worker is something different.
>
Done.
>
> process_syncing_tables_*(): Function names and associated comments are
> not very clear (process what?, handle what?).
>
Ok added some more explanation, hopefully it's better now.
> In process_syncing_tables_apply(), it says that
> logicalrep_worker_count() counts the apply worker as well, but what
> happens if the apply worker has temporarily disappeared? Since
Then the function is never going to be called for that subscription as
it's only called from the apply.
> logicalrep_worker_count() is only used in this one place, maybe have
> it count just the sync workers and rename it slightly.
>
Makes sense anyway though.
> Changed some LOG messages to DEBUG, not clear what the purpose there is.
In fact I changed some DEBUG messages to LOG in the main patch set, git
rebase just does not handle that very well. Fixed.
> check_max_subscription_sync_workers(): Same problem as discussed
> previously, checking a GUC setting against another GUC setting like
> this doesn't work. Just skip it and check at run time.
>
Yeah, we always check at run time.
> wait_for_sync_status_change(): Comment is wrong/misleading: It doesn't
> wait until it matches any specific state, it just waits for any state
> change.
>
Fixed.
> LogicalRepSyncTableStart(): The code that assembles the slot name
> needs to be aware of NAMEDATALEN. (Maybe at least throw in a static
> assert that NAMEDATALEN>=64.)
>
I switched to snprintf seems cleaner (also removes the need to know
about proper memory context of a private variable from
LogicalRepSyncTableStart()).
I also added option called SKIP CONNECT to CREATE SUBSCRIPTION (yes that
WIP name, I welcome suggestions). That skips the initial connection
attempt which is now needed always since we need to copy the table list.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment | Content-Type | Size |
---|---|---|
0001-Logical-replication-support-for-initial-data-copy-v2.patch | text/plain | 147.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2017-01-18 14:18:24 | Re: Parallel Index Scans |
Previous Message | Tom Lane | 2017-01-18 13:43:24 | Re: Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling) |