Re: segfault in hot standby for hash indexes

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: segfault in hot standby for hash indexes
Date: 2017-03-22 03:11:48
Message-ID: CAA4eK1LyzMEbDNNGLVW8SWBsg3ZoBfYWT9mHc4ThHTQFU4inHA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 21, 2017 at 11:49 PM, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
>>
>> I can confirm that that fixes the seg faults for me.
>
> Thanks for confirmation.
>
>>
>> Did you mean you couldn't reproduce the problem in the first place, or that
>> you could reproduce it and now the patch fixes it? If the first of those, I
>> forgot to say you do have to wait for hot standby to reach consistency and
>> open for connections, and then connect to the standby ("psql -p 9874"),
>> before the seg fault will be triggered.
>
> I meant that I was not able to reproduce the issue on HEAD.
>
>>
>> But, there are places where hash_xlog_vacuum_get_latestRemovedXid diverges
>> from btree_xlog_delete_get_latestRemovedXid, and I don't understand the
>> reason for the divergence. Is there a reason we dropped the PANIC if we
>> have not reached consistency?
>
> Well, I'm not quite sure how the standby would allow any backend to
> connect to it until it has reached a consistent state. If you look at
> the definition of btree_xlog_delete_get_latestRemovedXid(), just
> before the consistency check there is an if-condition 'if
> (CountDBBackends(InvalidOid) == 0)', which means
> we check for a consistent state only after knowing that there are
> some backends connected to the standby. So, is there a possibility of
> having some backend connected to the standby server without it being in
> a consistent state?
>

I don't think so, but I think we should have a reachedConsistency check
and elog(PANIC,..) similar to btree. If you look at the other conditions
where we PANIC in the btree or hash xlog code, you will notice that those
are also cases that are theoretically not possible. It seems the purpose is
to save the database from getting corrupted or behaving insanely if, for
some reason (like a coding error), the check fails.
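
To illustrate the kind of guard I mean, here is a rough standalone
sketch of the btree-style ordering: return early when no backends are
connected, and otherwise PANIC if consistency has not been reached.
Note that this is not the actual PostgreSQL code. connected_backends,
panic_count, mock_panic() and sketch_get_latestRemovedXid() are
hypothetical stand-ins so that the fragment compiles on its own; the
real function would call CountDBBackends(InvalidOid) and
elog(PANIC, ...), and the latter does not return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for PostgreSQL internals (assumptions, not real APIs). */
typedef unsigned int TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

static bool reachedConsistency = false; /* mirrors the recovery flag of the same name */
static int  connected_backends = 0;     /* stands in for CountDBBackends(InvalidOid) */
static int  panic_count = 0;            /* counts calls that would have been PANICs */

/* Stand-in for elog(PANIC, ...); the real call aborts and never returns. */
static void
mock_panic(const char *msg)
{
    panic_count++;
    fprintf(stderr, "PANIC: %s\n", msg);
}

/*
 * Sketch of the guard ordering: the consistency check is reached only
 * when at least one backend is connected to the standby.
 */
static TransactionId
sketch_get_latestRemovedXid(TransactionId xid_from_heap)
{
    /* No backends connected: no conflict is possible, so skip the lookup. */
    if (connected_backends == 0)
        return InvalidTransactionId;

    /* Theoretically impossible, but fail loudly rather than read bad data. */
    if (!reachedConsistency)
    {
        mock_panic("cannot operate with inconsistent data");
        return InvalidTransactionId;    /* unreachable with a real PANIC */
    }

    /* The real code would scan index and heap pages here. */
    return xid_from_heap;
}
```

With no backends connected the function returns before the consistency
check, so the PANIC path can only be hit in the "impossible" case of a
connected backend on an inconsistent standby.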

On a quick look, I don't find any other divergence between the two
functions. Is there any other divergence? If so, I think we should at
the very least mention something about it in the function header.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
