B-tree Index corruption

From: Filip Sedlák <filip(at)sedlakovi(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: Michal Danilák <michal(at)monitora(dot)cz>
Subject: B-tree Index corruption
Date: 2023-11-20 15:08:39
Message-ID: 8cf27c03-941e-4b12-9bdc-06237c0302d4@sedlakovi.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,
I see a weird behaviour which might be a bug in Postgres but I wasn't
able to find it online.

We have a simple table with a UNIQUE constraint that doesn't work. I'm
able to insert one row with a conflicting value (while two instances of
this value are already in the table). The next insert fails. When using
the underlying index for lookups, the old values are not retrieved
(consistent with the allowed insert).

We're on version 15.3 and I see some index misbehaviour fixed in 15.5
but nothing on btree over text values. I saw some posts about glibc
upgrade changing collation rules but I don't think it's our case because
we see it with plain ASCII values. More importantly, we last upgraded
the major OS version in June but I was able to find a row missing from
the index that was inserted a few weeks ago.

I don't have a self-contained repro script. I can easily test anything
on our snapshot. But I don't know where to look further. I'd be happy
for guidance.

See below the psql session demonstrating the problem.

Best regards
Filip

monitora_snapshot=# SELECT version();
version

---------------------------------------------------------------------------------------------------------------------
PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1) on x86_64-pc-linux-gnu,
compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 row)

monitora_snapshot=# \d social_hashtag

Table "public.social_hashtag"
Column | Type | Collation | Nullable |
Default
---------+--------------------------+-----------+----------+--------------------------------------------
id | integer | | not null |
nextval('social_hashtag_id_seq'::regclass)
hashtag | character varying(50) | | not null |
created | timestamp with time zone | | not null |
Indexes:
"social_hashtag_pkey" PRIMARY KEY, btree (id)
"social_hashtag_hashtag_eac29120_like" btree (hashtag
varchar_pattern_ops)
"social_hashtag_hashtag_key" UNIQUE CONSTRAINT, btree (hashtag)

monitora_snapshot=# EXPLAIN SELECT id, hashtag
FROM social_hashtag
WHERE hashtag = '________________';
QUERY PLAN

------------------------------------------------------------------------------------------------------------
Index Scan using social_hashtag_hashtag_eac29120_like on
social_hashtag (cost=0.43..2.65 rows=1 width=18)
Index Cond: ((hashtag)::text = '________________'::text)
(2 rows)

monitora_snapshot=#
monitora_snapshot=# -- OK, so the select uses the other index. It shows
monitora_snapshot=# -- two rows with the "unique" hashtag.
monitora_snapshot=#
monitora_snapshot=# SELECT id, hashtag
FROM social_hashtag
WHERE hashtag = '________________';
id | hashtag
---------+------------------
1565500 | ________________
6329472 | ________________
(2 rows)

monitora_snapshot=# -- Still, we can insert
monitora_snapshot=# INSERT INTO social_hashtag
VALUES (DEFAULT, '________________', NOW());
INSERT 0 1

monitora_snapshot=# -- And retrieve the data
monitora_snapshot=#
monitora_snapshot=# SELECT id, hashtag
FROM social_hashtag
WHERE hashtag = '________________';
id | hashtag
---------+------------------
1565500 | ________________
6329472 | ________________
6338012 | ________________
(3 rows)

monitora_snapshot=#
monitora_snapshot=# -- But we can't insert for the second time.
monitora_snapshot=#
monitora_snapshot=# INSERT INTO social_hashtag VALUES (DEFAULT,
'________________', NOW());
ERROR: duplicate key value violates unique constraint
"social_hashtag_hashtag_key"
DETAIL: Key (hashtag)=(________________) already exists.

monitora_snapshot=# DROP INDEX social_hashtag_hashtag_eac29120_like ;
DROP INDEX

monitora_snapshot=# EXPLAIN SELECT id, hashtag
FROM social_hashtag
WHERE hashtag = '________________';
QUERY PLAN

--------------------------------------------------------------------------------------------------
Index Scan using social_hashtag_hashtag_key on social_hashtag
(cost=0.43..2.65 rows=1 width=18)
Index Cond: ((hashtag)::text = '________________'::text)
(2 rows)

monitora_snapshot=#
monitora_snapshot=# -- Dropping the other index means the select now
monitora_snapshot=# -- uses the index that's used for the constraint.
monitora_snapshot=# -- It sees only the last row.
monitora_snapshot=#
id | hashtag
---------+------------------
6337933 | ________________

(1 row)

--
Filip Sedlák

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2023-11-20 15:10:44 Re: BUG #18208: ts_headline MinWords and MaxWords cannot be set to the same value
Previous Message PG Bug reporting form 2023-11-20 14:04:26 BUG #18208: ts_headline MinWords and MaxWords cannot be set to the same value