From: | Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: recovering from "found xmin ... from before relfrozenxid ..." |
Date: | 2020-07-24 09:05:08 |
Message-ID: | CAE9k0PnRhymKiyaxUE8u=6_4=K_1ktKjOVb51K_H2eGmL81f1g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi All,
Attached is the patch that adds heap_force_kill(regclass, tid[]) and
heap_force_freeze(regclass, tid[]) functions which Robert mentioned in the
first email in this thread. The patch basically adds an extension named
pg_surgery that contains these functions. Please have a look and let me
know your feedback. Thank you.
--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com
On Thu, Jul 16, 2020 at 9:44 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Jul 16, 2020 at 10:00 AM Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
> > I see your point, though: the tuple has to be able to survive
> > HOT-pruning in order to cause a problem when we check whether it needs
> > freezing.
>
> Here's an example where the new sanity checks fail on an invisible
> tuple without any concurrent transactions:
>
> $ initdb
> $ pg_ctl start -l ~/logfile
> $ createdb
> $ psql
>
> create table simpsons (a int, b text);
> vacuum freeze;
>
> $ cat > txid.sql
> select txid_current();
> $ pgbench -t 131072 -c 8 -j 8 -n -f txid.sql
> $ psql
>
> insert into simpsons values (1, 'homer');
>
> $ pg_ctl stop
> $ pg_resetwal -x 1000 $PGDATA
> $ pg_ctl start -l ~/logfile
> $ psql
>
> update pg_class set relfrozenxid = (relfrozenxid::text::integer +
> 2000000)::text::xid where relname = 'simpsons';
>
> rhaas=# select * from simpsons;
> a | b
> ---+---
> (0 rows)
>
> rhaas=# vacuum simpsons;
> ERROR: found xmin 1049082 from before relfrozenxid 2000506
> CONTEXT: while scanning block 0 of relation "public.simpsons"
>
> This is a fairly insane situation, because we should have relfrozenxid
> < tuple xid < xid counter, but instead we have xid counter < tuple xid
> < relfrozenxid, but it demonstrates that it's possible to have a
> database which is sufficiently corrupt that you can't escape from the
> new sanity checks using only INSERT, UPDATE, and DELETE.
>
> Now, an even easier way to create a table with a tuple that prevents
> vacuuming and also can't just be deleted is to simply remove a
> required pg_clog file (and maybe restart the server to clear out any
> cached data in the SLRUs). What we typically do with customers who
> need to recover from that situation today is give them a script to
> fabricate a bogus CLOG file that shows all transactions as committed
> (or, perhaps, aborted). But I think that the tools proposed on this
> thread might be a better approach in certain cases. If the problem is
> that a pg_clog file vanished, then recreating it with whatever content
> you think is closest to what was probably there before is likely the
> best you can do. But if you've got some individual tuples with crazy
> xmin values, you don't really want to drop matching files in pg_clog;
> it's better to fix the tuples.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>
Attachment | Content-Type | Size |
---|---|---|
0001-Add-contrib-pg_surgery-to-perform-surgery-on-the-dam.patch | text/x-patch | 19.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2020-07-24 09:13:28 | Re: renaming configure.in to configure.ac |
Previous Message | Tomas Vondra | 2020-07-24 08:40:47 | Re: Default setting for enable_hashagg_disk |