From: | Önder Kalacı <onderkalaci(at)gmail(dot)com> |
---|---|
To: | "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com> |
Cc: | Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com> |
Subject: | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher |
Date: | 2022-09-29 17:08:52 |
Message-ID: | CACawEhVujhCCsrDhNHPLfHr7tWzkj-Ar8q41+uQ+4DFjJKknYw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Hayato Kuroda,
Thanks for the review!
> ====
> 01 relation.c - general
>
> Many files are newly included.
> I was not sure but some codes related with planner may be able to move to
> src/backend/optimizer/plan.
> How do you and any other one think?
>
>
My thinking on those functions is that they should probably stay
in src/backend/replication/logical/relation.c. My main motivation is that
those functions are so much tailored to the purposes of this file that I
cannot see any use-case for these functions in any other context.
Still, at some point, I considered maybe doing something similar
to src/backend/executor/execReplication.c, where I create a new file,
say, src/backend/optimizer/plan/planReplication.c or such as you noted. I'm
a bit torn on this.
Does anyone have any strong opinions for moving to
src/backend/optimizer/plan/planReplication.c? (or another name)
> 02 relation.c - FindLogicalRepUsableIndex
>
> ```
> +/*
> + * Returns an index oid if we can use an index for the apply side. If not,
> + * returns InvalidOid.
> + */
> +static Oid
> +FindLogicalRepUsableIndex(Relation localrel, LogicalRepRelation
> *remoterel)
> ```
>
> I grepped files, but I cannot find the word "apply side". How about
> "subscriber" instead?
>
Yes, it makes sense. I guess I made up the "apply side" as there is the
concept of "apply worker". But, yes, subscribers sound better, updated.
>
> 03 relation.c - FindLogicalRepUsableIndex
>
> ```
> + /* Simple case, we already have an identity or pkey */
> + idxoid = GetRelationIdentityOrPK(localrel);
> + if (OidIsValid(idxoid))
> + return idxoid;
> +
> + /*
> + * If index scans are disabled, use a sequential scan.
> + *
> + * Note that we still allowed index scans above when there is a
> primary
> + * key or unique index replica identity, but that is the legacy
> behaviour
> + * (even when enable_indexscan is false), so we hesitate to move
> this
> + * enable_indexscan check to be done earlier in this function.
> + */
> + if (!enable_indexscan)
> + return InvalidOid;
> ```
>
> a.
> I think "identity or pkey" should be "replica identity key or primary key"
> or "RI or PK"
>
Looking into other places, it seems "replica identity index" is favored
over "replica identity key". So, I used that term.
You can see this pattern in RelationGetReplicaIndex()
>
> b.
> Later part should be at around GetRelationIdentityOrPK.
>
Hmm, I cannot follow this comment. Can you please clarify?
>
>
> 04 relation.c - FindUsableIndexForReplicaIdentityFull
>
> ```
> + MemoryContext usableIndexContext;
> ...
> + usableIndexContext = AllocSetContextCreate(CurrentMemoryContext,
> +
> "usableIndexContext",
> +
> ALLOCSET_DEFAULT_SIZES);
> ```
>
> I grepped other sources, and I found that the name like "tmpcxt" is used
> for the temporary MemoryContext.
>
I think there are also several contextes that are named more specifically,
such as new_pdcxt, perTupCxt, anl_context, cluster_context and many others.
So, I think it is better to have specific names, no?
>
> 05 relation.c - SuitablePathsForRepIdentFull
>
> ```
> + indexRelation = index_open(idxoid,
> AccessShareLock);
> + indexInfo = BuildIndexInfo(indexRelation);
> + is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
> + is_partial = (indexInfo->ii_Predicate != NIL);
> + is_only_on_expression =
> IndexOnlyOnExpression(indexInfo);
> + index_close(indexRelation, NoLock);
> ```
>
> Why the index is closed with NoLock? AccessShareLock is acquired, so
> shouldn't same lock be released?
>
Hmm, yes you are right. Keeping the lock seems unnecessary and wrong. It
could actually have prevented dropping an index. However, given
that RelationFindReplTupleByIndex() also closes this index at the end, the
apply worker releases the lock. Hence, no problem observed.
Anyway, I'm still changing it to releasing the lock.
Also note that as soon as any index is dropped on the relation, the cache
is invalidated and suitable indexes are re-calculated. That's why it seems
fine to release the lock.
>
>
> 06 relation.c - GetCheapestReplicaIdentityFullPath
>
> IIUC a query like "SELECT tbl WHERE attr1 = $1 AND attr2 = $2 ... AND
> attrN = $N" is emulated, right?
> you can write explicitly it as comment
>
>
The inlined comment in the function has a similar comment. Is that clear
enough?
/* * Generate restrictions for all columns in the form of col_1 = $1 AND *
col_2 = $2 ... */
> 07 relation.c - GetCheapestReplicaIdentityFullPath
>
> ```
> + Path *path = (Path *) lfirst(lc);
> + Oid idxoid = GetIndexOidFromPath(path);
> +
> + if (!OidIsValid(idxoid))
> + {
> + /* Unrelated Path, skip */
> + suitableIndexList = lappend(suitableIndexList,
> path);
> + }
> ```
>
> I was not clear about here. IIUC in the function we want to extract "good"
> scan plan and based on that the cheapest one is chosen.
> GetIndexOidFromPath() seems to return InvalidOid when the input path is
> not index scan, so why is it appended to the suitable list?
>
>
It could be a sequential scan that we have fall-back. However, we already
add the sequential scan at the end of the function. So, actually you are
right, there is no need to keep any other paths here. Adjusted the comments.
>
> ===
> 08 worker.c - usable_indexoid_internal
>
> I think this is not "internal" function, such name should be used for like
> "apply_handle_commit" - "apply_handle_commit_internal", or
> "apply_handle_insert" - "apply_handle_insert_internal".
> How about "get_usable_index" or something?
>
Yeah, you are right. I use this function inside functions ending with
_internal, but this one is clearly not an internal function. I
used get_usable_indexoid().
>
> 09 worker.c - usable_indexoid_internal
>
> ```
> + Oid targetrelid =
> targetResultRelInfo->ri_RelationDesc->rd_rel->oid;
> + Oid localrelid =
> relinfo->ri_RelationDesc->rd_id;
> +
> + if (targetrelid != localrelid)
> ```
>
> I think these lines are very confusable.
> IIUC targetrelid is corresponded to the "parent", and localrelid is
> corresponded to the "child", right?
> How about changing name to "partitionedoid" and "leadoid" or something?
>
We do not know whether targetrelid is definitely a "parent". But, if that
is a parent, this function fetches the relevant partition's usableIndexOid.
So, I'm not convinced that "parent" is a good choice.
Though, I agree that we can improve the code a bit. I now
use targetrelkind and dropped localrelid to check whether the target is a
partitioned table. Is this better?
>
> ===
> 10 032_subscribe_use_index.pl
>
> ```
> # create tables pub and sub
> $node_publisher->safe_psql('postgres',
> "CREATE TABLE test_replica_id_full (x int)");
> $node_publisher->safe_psql('postgres',
> "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;");
> $node_subscriber->safe_psql('postgres',
> "CREATE TABLE test_replica_id_full (x int)");
> $node_subscriber->safe_psql('postgres',
> "CREATE INDEX test_replica_id_full_idx ON
> test_replica_id_full(x)");
> ```
>
> In many places same table is defined, altered as "REPLICA IDENTITY FULL",
> and index is created.
> Could you combine them into function?
>
Well, I'm not sure if it is worth the complexity. There are only 4 usages
of the same table, and these are all pretty simple statements, and all
other tests seem to have a similar pattern. I have not seen any tests where
these simple statements are done in a function even if they are repeated.
I'd rather keep it so that this doesn't lead to other style discussions?
>
> 11 032_subscribe_use_index.pl
>
> ```
> # wait until the index is used on the subscriber
> $node_subscriber->poll_query_until(
> 'postgres', q{select (idx_scan = 1) from pg_stat_all_indexes where
> indexrelname = 'test_replica_id_full_idx';}
> ) or die "Timed out while waiting for check subscriber tap_sub_rep_full_0
> updates one row via index";
> ```
>
> In many places this check is done. Could you combine them into function?
>
I'm a little confused. Isn't that already inside a function (e.g.,
poll_query_until) ? Can you please clarify this suggestion a bit more?
>
> 12 032_subscribe_use_index.pl
>
> ```
> # create pub/sub
> $node_publisher->safe_psql('postgres',
> "CREATE PUBLICATION tap_pub_rep_full FOR TABLE
> test_replica_id_full");
> $node_subscriber->safe_psql('postgres',
> "CREATE SUBSCRIPTION tap_sub_rep_full CONNECTION
> '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_rep_full"
> );
> ```
>
> Same as above
>
Well, again, I'm not sure if it is worth moving these simple statements to
functions as an improvement here. One might tell that it is better to see
the statements explicitly on the test -- which almost all the tests do. I
want to avoid introducing some unusual pattern to the tests.
>
> 13 032_subscribe_use_index.pl
>
> ```
> # cleanup pub
> $node_publisher->safe_psql('postgres', "DROP PUBLICATION
> tap_pub_rep_full");
> $node_publisher->safe_psql('postgres', "DROP TABLE test_replica_id_full");
> # cleanup sub
> $node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION
> tap_sub_rep_full");
> $node_subscriber->safe_psql('postgres', "DROP TABLE test_replica_id_full");
> ```
>
> Same as above
>
Same as above :)
>
> 14 032_subscribe_use_index.pl - SUBSCRIPTION USES INDEX
>
> ```
> # make sure that the subscriber has the correct data
> my $result = $node_subscriber->safe_psql('postgres',
> "SELECT sum(x) FROM test_replica_id_full");
> is($result, qq(212), 'ensure subscriber has the correct data at the end of
> the test');
>
> $node_subscriber->poll_query_until(
> 'postgres', q{select sum(x)=212 AND count(*)=21 AND count(DISTINCT
> x)=20 FROM test_replica_id_full;}
> ) or die "ensure subscriber has the correct data at the end of the test";
> ```
>
> I think first one is not needed.
>
I preferred to keep the second one because *is($result, ..* is needed for
tests to show the progress while running.
>
>
> 15 032_subscribe_use_index.pl - SUBSCRIPTION USES INDEX UPDATEs MULTIPLE
> ROWS
>
> ```
> # insert some initial data within the range 0-20
> $node_publisher->safe_psql('postgres',
> "INSERT INTO test_replica_id_full SELECT i%20 FROM
> generate_series(0,1000)i;"
> );
> ```
>
> I think data is within the range 0-19.
> (There are some mistakes)
>
Yes, I fixed it all.
>
> ===
> 16 test/subscription/meson.build
>
> Your test 't/032_subscribe_use_index.pl' must be added in the 'tests' for
> meson build system.
> (I checked on my env, and your test works well)
>
>
Oh, I didn't know about this, thanks!
Attached v14.
Thanks,
Onder
Attachment | Content-Type | Size |
---|---|---|
v14_0001_use_index_on_subs_when_pub_rep_ident_full.patch | application/octet-stream | 79.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Önder Kalacı | 2022-09-29 17:14:06 | Re: A potential memory leak on Merge Join when Sort node is not below Materialize node |
Previous Message | Bharath Rupireddy | 2022-09-29 17:08:33 | Re: Avoid memory leaks during base backups |