ERROR: posting list tuple with 2 items cannot be split at offset 17

From: Paul McGarry <paul(at)paulmcgarry(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: ERROR: posting list tuple with 2 items cannot be split at offset 17
Date: 2023-02-09 07:53:47
Message-ID: CAPrE0SYkv7=qoyHKivAKzCW9_vFCe8jEph8ixTDA8Dh6ae+mFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I have three databases, two of databases where I am experiencing the issue
below.

The first database was created from a dump in Feb 2022 (a few weeks after
the time period for which I seem to have problematic indexes, maybe).
The second database was then cloned from the first (ie filesystem level
copy) soon after.
Since then all databases have undergone a number of minor version upgrades,
including to 13.9 and an OS update last week for the two problem databases
(the other is still on 13.8).

Now, a process which does clears some data > 13 months old is getting an
error when trying to do that update.

My suspicion is that either:
- there was probably an issue with the index 12 months ago and that problem
was copied when I cloned the database, and is just becoming apparent now a
script is accessing 13 month olf data.
- something in our recent upgrade has caused the problem.

The third database is still on 13.8, and with some OS updates pending, and
is not experiencing the problem.

The problem emerges as:

====
UPDATE widget SET description=NULL WHERE (time>='2022-01-07
17:40:05.140287+00'::timestamp - INTERVAL '4 days' AND time<'2022-01-08
17:40:05.780573+00' AND description IS NOT NULL);
ERROR: posting list tuple with 2 items cannot be split at offset 17
====

A select on the same data works fine, so presumably a problem updating the
index, not accessing it or the corresponding table):

====
db=> select count(*) from widget where (time>='2022-01-07
17:40:05.140287+00'::timestamp - INTERVAL '4 days' AND time<'2022-01-08
17:40:05.780573+00' AND description IS NOT NULL);
count
--------
797943
====

The index used as per explain:
====
explain UPDATE widget SET description=NULL WHERE (time>='2022-01-07
17:40:05.140287+00'::timestamp - INTERVAL '4 days' AND time<'2022-01-08
17:40:05.780573+00' AND description IS NOT NULL);

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on widget (cost=0.57..2921626.80 rows=205910 width=1066)
-> Index Scan using widget_time_client_idx on widget
(cost=0.57..2921626.80 rows=205910 width=1066)
Index Cond: (("time" >= '2022-01-03 17:40:05.140287'::timestamp
without time zone) AND ("time" < '2022-01-08 17:40:05.780573+00'::timestamp
with time zone))
Filter: (description IS NOT NULL)
(4 rows)
====

amcheck attempted on that index, but doesn't seem to identify any issues:

====
db=> SELECT bt_index_check('widget_time_client_idx',true);
DEBUG: verifying consistency of tree structure for index
"widget_time_client_idx"
DEBUG: verifying level 3 (true root level)
DEBUG: verifying level 2
DEBUG: verifying level 1
DEBUG: verifying level 0 (leaf level)
DEBUG: verifying that tuples from index "widget_time_client_idx" are
present in "widget"
DEBUG: finished verifying presence of 639872196 tuples from table "widget"
with bitset 25.94% set
bt_index_check
----------------

(1 row)

db=> SELECT bt_index_parent_check('widget_time_client_idx',true,true);
DEBUG: verifying consistency of tree structure for index
"widget_time_client_idx" with cross-level checks
DEBUG: verifying level 3 (true root level)
DEBUG: verifying level 2
DEBUG: verifying level 1
DEBUG: verifying level 0 (leaf level)

DEBUG: verifying that tuples from index "widget_time_client_idx" are
present in "widget"
DEBUG: finished verifying presence of 639874864 tuples from table "widget"
with bitset 25.94% set
bt_index_parent_check
-----------------------

(1 row)

====

We recreated that index and deleted the old index, and the update then
worked.
I've done that on one of the databases so far.

Despite that working, it then occurred to me that the problem might be due
to problems updating a different index on the same table.

I then found on the two problem DBs (but not the 13.8 one):

====
db=> select bt_index_check('widget_name_idx');
ERROR: item order invariant violated for index "widget_name_idx"
DETAIL: Lower index tid=(682201,85) (points to index tid=(682278,4097))
higher index tid=(682201,86) (points to index tid=(716079,1)) page
lsn=580/E554A858.
====

====
db=> select bt_index_check('widget_name_idx');
ERROR: item order invariant violated for index "widget_name_idx"
DETAIL: Lower index tid=(682201,49) (points to index tid=(682245,1))
higher index tid=(682201,50) (points to index tid=(734398,1)) page
lsn=566/E67C5FF0.
====

which as a text field, seems more likely to be the result of a collation
change problem that might accompany an OS update.

But if it is the problem, why did the update start working after I
recreated the other index?

I think I should now:
- recreate the widget_name_idx on the problem servers
- run bt_index_check across all other indexes

Any suggestions on what else I should look into, in particular anything I
should check before upgrading the remaining 13.8 DB to 13.9?

Thanks for any help,

Paul

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Joseph Kennedy 2023-02-09 08:54:00 Re: PostgreSQL
Previous Message Vladimir Sitnikov 2023-02-09 05:41:46 Re: Boyer-Moore string searching in LIKE, ILIKE, and/or POSITION?