From: | s(dot)p(dot)e(at)gmx-topmail(dot)de |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Postgress 13.x: wrong result for delete with subquery |
Date: | 2021-01-28 12:08:21 |
Message-ID: | trinity-1c565d44-159f-488b-a518-caf13883134f-1611835701633@3c-app-gmx-bap78 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
Hallo PSQL-Team,
I found a dataset (attachment bug13.csv) which produces a wrong result on postgreSQL 13 for a special delete command:
Table t1 and t2 contain the same dataset, every row of the data (id1, id2) is unique
So the command
DELETE FROM t1 WHERE (id1,id2) IN (SELECT t2.id1, t2.id2 FROM t2);
should remove all lines. So it does in postgres 10 and 12.5. In Version 13.0 and 13.1 there is on line left after the command.
In detail:
I have tested on PostgreSQL 10 (local instalaltion on SuSE-Linux) and three docker-Images (12.5, 13.0, 13.1).
The docker container are installed from scratch
docker run --name psqlXX \
--shm-size=880MB \
-v $PWD/dXX:/var/lib/postgresql/data \
-v $PWD/extra:/var/extra \
-e POSTGRES_PASSWORD=comirnaty \
-d postgres:XX.X
docker exec -it psqlXX bash
psql -U postgres < /var/extra/bug13.sql > /var/extra/XXX.out
bug13.sql does the following:
-- wrong result on postgres 13.1 (docker)
-- also tested postgres 13.0 (docker): same bug
-- also tested postgres 12.5 (docker): no bug
\set VERBOSITY verbose
SELECT version();
-- prepare the test-data-set, all rows are unique
DROP TABLE IF EXISTS tstdata;
CREATE TABLE tstdata (
id1 VARCHAR( 3),
id2 VARCHAR( 25)
);
CREATE UNIQUE INDEX tst_idx ON tstdata (id1, id2);
COPY tstdata (id1, id2) FROM '/var/extra/bug13.csv';
-- tables for the test
DROP TABLE IF EXISTS t1, t2;
CREATE TEMPORARY TABLE t1 (
id1 VARCHAR( 32),
id2 VARCHAR( 32)
);
CREATE UNIQUE INDEX t1_idx ON t1 (id1,id2);
CREATE TEMPORARY TABLE t2 (
id1 VARCHAR( 32),
id2 VARCHAR( 32)
);
-- fill the tst tables with data
INSERT INTO t1 SELECT id1,id2 from tstdata;
INSERT INTO t2 SELECT id1,id2 from tstdata;
-- here ist the test for the bug with wrong result
-- t1 and t2 contain the identical set
-- so this command should remove all entries in t1
EXPLAIN DELETE FROM t1 WHERE (id1,id2) IN (SELECT id1,id2 FROM t2);
-- QUERY PLAN
-- ----------------------------------------------------------------------------------------------
-- Delete on t1 (cost=595.12..1201.93 rows=3602 width=12)
-- -> Hash Join (cost=595.12..1201.93 rows=3602 width=12)
-- Hash Cond: (((t1.id1)::text = (t2.id1)::text) AND ((t1.id2)::text = (t2.id2)::text))
-- -> Seq Scan on t1 (cost=0.00..487.06 rows=14406 width=170)
-- -> Hash (cost=573.50..573.50 rows=1441 width=170)
-- -> HashAggregate (cost=559.09..573.50 rows=1441 width=170)
-- Group Key: (t2.id1)::text, (t2.id2)::text
-- -> Seq Scan on t2 (cost=0.00..487.06 rows=14406 width=170)
DELETE FROM t1 WHERE (id1,id2) IN (SELECT id1,id2 FROM t2);
-- on version 13.x there is still one entry left
-- MED | 0000000000000010035567472
SELECT * from t1;
-- sometimes after multiple tries: the bug disappears, after recreating t2 the bug recurs
-- - creating an index on t2 -> correct result
-- - define t2 non temporary -> correct result
-- - analyse t2 -> correct result
-- postgres 12.5 uses the same query plan without any error
-- second service ----------------------------------------------------------------------
-- different query plan after analyse and the bug disappears
-- refill t1
DELETE FROM t1;
INSERT INTO t1 SELECT id1,id2 from tstdata;
ANALYSE t2;
EXPLAIN DELETE FROM t1 WHERE (id1,id2) IN (SELECT id1,id2 FROM t2);
-- QUERY PLAN
-- ----------------------------------------------------------------------------------------------
-- Delete on t1 (cost=1506.92..2718.77 rows=7192 width=12)
-- -> Hash Semi Join (cost=1506.92..2718.77 rows=7192 width=12)
-- Hash Cond: (((t1.id1)::text = (t2.id1)::text) AND ((t1.id2)::text = (t2.id2)::text))
-- -> Seq Scan on t1 (cost=0.00..972.70 rows=28770 width=170)
-- -> Hash (cost=808.57..808.57 rows=46557 width=36)
-- -> Seq Scan on t2 (cost=0.00..808.57 rows=46557 width=36)
DELETE FROM t1 WHERE (id1,id2) IN (SELECT id1,id2 FROM t2);
SELECT * from t1;
Data, Command and Output are attached to this mail.
I reduced the bug-inducing datsaset from initial 8 Mio entries to this example file. The bug is very sensitive to the exact data and disappears on the most changes.
I hope, that a added everything necessary for further analysis.
Best regards
Stephan Endres
Attachment | Content-Type | Size |
---|---|---|
unknown_filename | text/html | 9.5 KB |
125.out | application/octet-stream | 1.8 KB |
130.out | application/octet-stream | 1.9 KB |
131.out | application/octet-stream | 1.9 KB |
bug13.csv | text/csv | 1.3 MB |
bug13.sql | application/sql | 1.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | PG Bug reporting form | 2021-01-28 12:19:52 | BUG #16842: pg_dump uses seek calls on pipe files: suggesting adding a flag to disable seek calls |
Previous Message | Ajay Chitale | 2021-01-28 09:36:32 | Re: BUG #16841: psql -- \d tablename , displays "Error : column c.relhasoids does not exit" |