Re: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Takeshi Ideriha <iderihatakeshi(at)gmail(dot)com>
Cc: exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values
Date: 2024-10-05 10:20:05
Message-ID: CAA4eK1J-8FQjSsjUPYk-8yqx4GJPQUmj6-aKz_xWm4NARkgwyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Tue, Oct 1, 2024 at 7:07 PM Takeshi Ideriha <iderihatakeshi(at)gmail(dot)com> wrote:
>

Sending it again to avoid getting it stuck because the original was
replied to both -hackers and -bugs.

>
> >We forgot to set/unset the flag in functions
> >systable_beginscan_ordered and systable_endscan_ordered. BTW,
>
> Thank you for the clarification.
>
> >shouldn't this occur even without prepare transaction? If so, we need
> >to backpatch this till 14.
>
> Yes, it occurred also at PG14.
> I did some tests using 015_stream.pl with some modification like
> below, which tests the subscription about stream is on but two-phase
> is off.
> The same issue occurred at both current head source code and PG14.
>
> ```
> --- a/src/test/subscription/t/015_stream.pl
> +++ b/src/test/subscription/t/015_stream.pl
> @@ -134,9 +134,11 @@ my $node_subscriber =
> PostgreSQL::Test::Cluster->new('subscriber');
> $node_subscriber->init;
> $node_subscriber->start;
>
> +my $default = join('', map {chr(65 + rand 26)} (1 .. 10000));
> +
> # Create some preexisting content on publisher
> $node_publisher->safe_psql('postgres',
> - "CREATE TABLE test_tab (a int primary key, b bytea)");
> + "CREATE TABLE test_tab (a int primary key, b bytea, t text
> DEFAULT '$default')");
> $node_publisher->safe_psql('postgres',
> "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
>
> @@ -144,7 +146,7 @@ $node_publisher->safe_psql('postgres', "CREATE
> TABLE test_tab_2 (a int)");
>
> # Setup structure on subscriber
> $node_subscriber->safe_psql('postgres',
> - "CREATE TABLE test_tab (a int primary key, b bytea, c
> timestamptz DEFAULT now(), d bigint DEFAULT 999)"
> + "CREATE TABLE test_tab (a int primary key, b bytea, t text
> DEFAULT '$default', c timestamptz DEFAULT now(), d bigint DEFAULT
> 999)"
> );
> ```
>
> Logs from head source:
> ```
> 2024-10-01 12:34:56.694 UTC [575202] LOG: starting PostgreSQL 18devel
> on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat
> 11.4.1-2), 64-bit
> ...
> 2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG: statement: BEGIN;
> 2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG: statement:
> INSERT INTO test_tab SELECT i, sha256(i::text::bytea) FROM
> generate_series(3, 5000) s(i);
> 2024-10-01 12:34:57.524 UTC [575242] tap_sub ERROR: unexpected
> table_index_fetch_tuple call during logical decoding
> 2024-10-01 12:34:57.524 UTC [575242] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
> streaming 'on', origin 'any', publication_names '"tap_pub"')
> 2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG: released logical
> replication slot "tap_sub"
> 2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG: could not send data
> to client: Broken pipe
> 2024-10-01 12:34:57.525 UTC [575242] tap_sub FATAL: connection to client lost
> ...
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
> streaming 'on', origin 'any', publication_names '"tap_pub"')
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG: starting logical
> decoding for slot "tap_sub"
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL: Streaming
> transactions committing after 0/1583A68, reading WAL from 0/1583A30.
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
> streaming 'on', origin 'any', publication_names '"tap_pub"')
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG: logical decoding
> found consistent point at 0/1583A30
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL: There are no
> running transactions.
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
> streaming 'on', origin 'any', publication_names '"tap_pub"')
> 2024-10-01 12:34:57.831 UTC [575260] tap_sub ERROR: unexpected
> table_index_fetch_tuple call during logical decoding
> 2024-10-01 12:34:57.831 UTC [575260] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
> streaming 'on', origin 'any', publication_names '"tap_pub"')
> 2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG: released logical
> replication slot "tap_sub"
> 2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG: could not send data
> to client: Broken pipe
> 2024-10-01 12:34:57.832 UTC [575260] tap_sub FATAL: connection to client lost
> ```
>
> Logs from PG14 source:
>
> ```
> 2024-10-01 13:20:29.409 UTC [593696] LOG: starting PostgreSQL 14.9 on
> x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat
> 11.4.1-2), 64-bit
> ...
> 2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG: statement: BEGIN;
> 2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG: statement:
> INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3,
> 5000) s(i);
> 2024-10-01 13:20:31.301 UTC [593740] tap_sub ERROR: unexpected
> table_index_fetch_tuple call during logical decoding
> 2024-10-01 13:20:31.301 UTC [593740] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
> streaming 'on', publication_names '"tap_pub"')
> 2024-10-01 13:20:31.302 UTC [593740] tap_sub LOG: could not send data
> to client: Broken pipe
> 2024-10-01 13:20:31.302 UTC [593740] tap_sub FATAL: connection to client lost
> ...
> 2024-10-01 13:20:31.830 UTC [593756] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
> streaming 'on', publication_names '"tap_pub"')
> 2024-10-01 13:20:31.830 UTC [593756] tap_sub LOG: logical decoding
> found consistent point at 0/172C758
> 2024-10-01 13:20:31.830 UTC [593756] tap_sub DETAIL: There are no
> running transactions.
> 2024-10-01 13:20:31.830 UTC [593756] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
> streaming 'on', publication_names '"tap_pub"')
> 2024-10-01 13:20:31.834 UTC [593756] tap_sub ERROR: unexpected
> table_index_fetch_tuple call during logical decoding
> 2024-10-01 13:20:31.834 UTC [593756] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
> streaming 'on', publication_names '"tap_pub"')
> 2024-10-01 13:20:31.835 UTC [593756] tap_sub LOG: could not send data
> to client: Broken pipe
> 2024-10-01 13:20:31.835 UTC [593756] tap_sub FATAL: connection to client lost
> ```
>
> >Also, it is better to have a test for this, and let's ensure that the
> >new test doesn't increase the regression time too much if possible.
>
> Sure. I'm going to add some test codes in a few days.
>

The above test shown in the email will work to test this issue.
However, we should use 'debug_logical_replication_streaming' as that
will help to reproduce the issue with just one write operation. Is it
possible to add the test in 018_stream_subxact_abort where we already
use it?

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2024-10-05 18:48:05 Re: BUG #18637: CREATE INDEX won't look up operator classes in search_path if PARTITION BY is specified
Previous Message Amit Kapila 2024-10-05 10:12:14 Re: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2024-10-05 10:23:02 Re: Eager aggregation, take 3
Previous Message Amit Kapila 2024-10-05 10:16:19 Re: New PostgreSQL Contributors