From: | PG Bug reporting form <noreply(at)postgresql(dot)org> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Cc: | shaun(at)tembo(dot)io |
Subject: | BUG #18556: Parallel operation error during CREATE TABLE AS statement |
Date: | 2024-07-26 19:07:52 |
Message-ID: | 18556-06f52f915c19f7c0@postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 18556
Logged by: Shaun Thomas
Email address: shaun(at)tembo(dot)io
PostgreSQL version: 16.3
Operating system: Ubuntu 22.04.4 LTS
Description:
We have been witnessing an intermittent error from a Postgres 16.3 instance
if a table is created using a SELECT statement. The error is:
ERROR: cannot start commands during a parallel operation
The statement itself is very innocuous:
CREATE TEMP TABLE parallel_bonk AS (
SELECT
document_key,
full_document
FROM some_schema.source_table
WHERE document_key ->> '_id' IN ('xyz', 'pdq')
);
Just to eliminate some potential variables, this is a 54-million row HEAP
table with the following definition:
Column | Type | Collation |
Nullable | Default
-----------------------------------+--------------------------+-----------+----------+---------
_id | jsonb | |
not null |
operation_type | text | |
|
full_document | jsonb | |
|
ns_db | text | |
|
ns_coll | text | |
|
to_db | text | |
|
to_coll | text | |
|
document_key | jsonb | |
not null |
update_description_updated_fields | jsonb | |
|
update_description_removed_fields | text[] | |
|
cluster_time | jsonb | |
|
txn_number | bigint | |
|
lsid_id | text | |
|
lsid_uid | text | |
|
kafka_topic | text | |
not null |
kafka_partition | integer | |
not null |
kafka_offset | bigint | |
not null |
kafka_timestamp | timestamp with time zone | |
not null |
Indexes:
"source_table_pkey" PRIMARY KEY, btree (document_key)
"source_table_kafka_offset_idx" UNIQUE, btree (kafka_offset DESC)
Note that there are no triggers, rules, constraints, partitions, or any
parallel-unsafe entities associated with this table. The associated query
plan looks like this:
QUERY PLAN
--------------------------------------------------------------------------------------------
Gather (cost=1000.00..4045627.93 rows=544751 width=479)
Workers Planned: 8
-> Parallel Seq Scan on source_table (cost=0.00..3990152.83 rows=68094
width=479)
Filter: ((document_key ->> '_id'::text) = ANY
('{xyz,pdq}'::text[]))
JIT:
Functions: 4
Options: Inlining true, Optimization true, Expressions true, Deforming
true
Note that the problem persists regardless of the JIT setting. This appears
to be related to the parallel workers in some manner. We haven't been able
to distill this down to a simple or reliable repro case yet, as the problem
seems to occur entirely at random, suggesting it's some kind of parallel
operation race condition.
There have been other similar discussions regarding the dbt Postgres ETL
toolkit. It appears this ETL kit utilizes a loading procedure that is more
likely to trigger this edge case.
From | Date | Subject | |
---|---|---|---|
Next Message | PG Bug reporting form | 2024-07-26 21:06:51 | BUG #18557: Compatibility issue |
Previous Message | Zaid Shabbir | 2024-07-26 17:52:57 | Re: BUG #18554: pg_verifybackup is not work |