Quick Links

Re: Index corruption / planner issue with one table in my pg 11.6 instance

From:	Jeremy Finzel <finzelj(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Jerry Sievers <gsievers19(at)comcast(dot)net>, Peter Geoghegan <pg(at)bowt(dot)ie>, Michael Paquier <michael(at)paquier(dot)xyz>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Index corruption / planner issue with one table in my pg 11.6 instance
Date:	2019-12-10 14:25:18
Message-ID:	CAMa1XUjHNMMtMF=ZtvSb4uF1y=SXu+V9AmEWT+FtFfFs3SLrTg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Dec 10, 2019 at 12:09 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Yeah. The reported behavior can mostly be explained if we assume
> that there's some HOT chain in the table that involves an update
> of this particular column, so that if we build an index on that
> column we see a broken HOT chain, but building an index on some
> other column doesn't have a problem.
>

The problem exists so far as I can tell on indexing *any column* of *this
particular table*. I tried same experiment on another table in the same
replication stream, and I cannot reproduce it.

I am building the index **non-concurrently** every time.

> The thing this doesn't easily explain is that the behavior persists
> across repeated index rebuilds. A broken HOT chain is only broken
> as long as the older entry is still visible-to-somebody, so that
> such situations ought to be self-healing as time passes. If it
> fails repeatedly, this theory requires assuming that either
>
> 1. You've got some extremely old open transactions (maybe forgotten
> prepared transactions?), or
>

No prepared_xacts and no transactions older than a few hours. Several hour
transactions are common in this reporting system. I have not yet seen if
after several hours the index starts showing up in plans.

> 2. Your workload is constantly generating new broken HOT chains of
> the same sort, so that there's usually a live one when you try
> to build an index.
>
> The fact that you even notice the indcheckxmin restriction indicates
> that you do tend to have long-running transactions in the system,
> else the index would come free for use fairly quickly. So #1 isn't
> as implausible as I might otherwise think. But #2 seems probably
> more likely on the whole. OTOH, neither point is exactly within
> the offered evidence.
>

Is there a way for me to test this theory? I tried the following with no
change in behavior:

1. Disable write load to table
2. Vacuum analyze table (not vac full)
3. Create index
4. Explain

Still did not pick up the index.

Thanks,
Jeremy

In response to

Re: Index corruption / planner issue with one table in my pg 11.6 instance at 2019-12-10 06:09:21 from Tom Lane

Responses

Re: Index corruption / planner issue with one table in my pg 11.6 instance at 2019-12-12 13:52:14 from Jeremy Finzel

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Asif Rehman	2019-12-10 14:33:48	Re: WIP/PoC for parallel backup
Previous Message	Pavel Stehule	2019-12-10 13:47:03	Re: proposal: minscale, rtrim, btrim functions for numeric