From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | Kenneth Marshall <ktm(at)rice(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: unique index violation after pg_upgrade to PG10 |
Date: | 2017-10-25 05:20:39 |
Message-ID: | 20171025052039.GY21735@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Oct 24, 2017 at 02:57:47PM -0700, Peter Geoghegan wrote:
> On Tue, Oct 24, 2017 at 1:11 PM, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > ..which I gather just verifies that the index is corrupt, not sure if there's
> > anything else to do with it? Note, we've already removed the duplicate rows.
>
> Yes, the index itself is definitely corrupt -- this failed before the
> new "heapallindexed" check even started. Though it looks like that
> would have failed too, if you got that far, since the index points to
> a row that does not contain the same data. (I only know this because
> you dumped the heap tuple and the index tuple.)
I think you must have compared these:
On Tue, Oct 24, 2017 at 03:11:44PM -0500, Justin Pryzby wrote:
> ts=# SELECT * FROM bt_page_items(get_raw_page('sites_idx', 1));
>
> itemoffset | 48
> ctid | (1,37)
> itemlen | 32
> nulls | f
> vars | t
> data | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 0b 31 31 31 31 00 00 00 00 00 00
...
> itemoffset | 37
> ctid | (0,97)
> itemlen | 24
> nulls | f
> vars | t
> data | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 03 00 00
..but note those are both items in sites_idx (48 and 37, which I seem to have
pasted out of order).. I included both because I'm not confident I know what
the "index tid=(1,37)" referred to, but I gather it means item at offset=37
(and not item with ctid=(1,37).)
| [pryzbyj(at)database amcheck]$ psql --port 5678 ts -c "SELECT bt_index_check('sites_idx'::regclass::oid, heapallindexed=>True)"
| ERROR: high key invariant violated for index "sites_idx"
| DETAIL: Index tid=(1,37) points to heap tid=(0,97) page lsn=0/0.
ts=# SELECT * FROM page_header(get_raw_page('sites_idx', 1));
lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
-----+----------+-------+-------+-------+---------+----------+---------+-----------
0/0 | 0 | 0 | 872 | 1696 | 8176 | 8192 | 4 | 0
Here is its heap page:
ts=# SELECT * FROM heap_page_items(get_raw_page('sites', 0)) WHERE lp=97;
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
----+--------+----------+--------+--------+----------+----------+--------+-------------+------------+--------+------------------+-------+--------------------------------------------
97 | 968 | 1 | 52 | 21269 | 33567444 | 0 | (3,27) | 8204 | 2307 | 32 | 1110100000010000 | | \x700000001b4352434c4d542d43454d5330030303
Which I see ends with 0303 vs 0000..
t_infomask=2307=2048+256+3 =>
#define HEAP_HASNULL 0x0001 /* has null attribute(s) */
#define HEAP_HASVARWIDTH 0x0002 /* has variable-width attribute(s) */
#define HEAP_XMIN_COMMITTED 0x0100 /* t_xmin committed */
#define HEAP_XMAX_INVALID 0x0800 /* t_xmax invalid/aborted */
t_infomask2=8204 => 8192+12 =>
#define HEAP_KEYS_UPDATED 0x2000 /* tuple was updated and key cols modified, or tuple deleted */
Maybe this is relevant ?
ts=# SELECT * FROM heap_page_items(get_raw_page('sites', 3)) WHERE lp=27;
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------
27 | 0 | 0 | 0 | | | | | | | | | |
Justin
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2017-10-25 05:33:46 | Re: Current int & float overflow checking is slow. |
Previous Message | Michael Paquier | 2017-10-25 04:36:13 | Re: Remove secondary checkpoint |