From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Non-deterministic IndexTuple toast compression from index_form_tuple() + amcheck false positives |
Date: | 2019-01-23 18:59:55 |
Message-ID: | CAH2-Wz=nG9O+D_XwmHBxEZC8wdyB13--bBWunVSx9oYA1QznNA@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Jan 14, 2019 at 2:37 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> The fix here must be to normalize index tuples that are compressed
> within amcheck, both during initial fingerprinting, and during
> subsequent probes of the Bloom filter in bt_tuple_present_callback().
I happened to talk to Andres about this in person yesterday. He
thought that there was reason to be concerned about the need for
logical normalization beyond TOAST issues. Expression indexes were a
particular concern, because they could in principle have a change in
the on-disk representation without a change of logical values -- false
positives could result. He suggested that the long term solution was
to bring hash operator class hash functions into Bloom filter hashing,
at least where available.
I wasn't very enthused about this idea, because it will be expensive
and complicated for an uncertain benefit. There are hardly any btree
operator classes that can ever have bitwise distinct datums that are
equal, anyway (leaving aside issues with TOAST). For the cases that do
exist (e.g. numeric_ops display scale), we may not really want to
normalize the differences away. Having an index tuple with a
numeric_ops datum containing the wrong display scale but with
everything else correct still counts as corruption.
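As a rough illustration of "bitwise distinct datums that are equal", here is a small Python analogy using `decimal.Decimal`, whose values keep their display scale much as numeric does. This is not PostgreSQL code: `bitwise_fingerprint` and `logical_fingerprint` are hypothetical names, with the first standing in for hashing the raw on-disk representation and the second for hashing through a type-aware (hash opclass style) function.

```python
from decimal import Decimal
import hashlib


def bitwise_fingerprint(value: Decimal) -> str:
    # Stand-in for hashing the raw representation: the display scale
    # is part of the hashed input, so 1.0 and 1.00 hash differently.
    return hashlib.sha256(str(value).encode()).hexdigest()


def logical_fingerprint(value: Decimal) -> str:
    # Stand-in for a type-aware hash function: trailing zeros are
    # stripped first, so values that compare equal hash the same.
    return hashlib.sha256(str(value.normalize()).encode()).hexdigest()


a = Decimal("1.0")
b = Decimal("1.00")

print(a == b)                                            # equal as values
print(bitwise_fingerprint(a) == bitwise_fingerprint(b))  # representation differs
print(logical_fingerprint(a) == logical_fingerprint(b))  # scale normalized away
```

Whether the second behavior is desirable is exactly the open question: normalizing the scale away would also hide an index tuple that stores the wrong display scale.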
It now occurs to me that if we wanted to go further than simply
normalizing away TOAST differences, my pending nbtree patch could
enable a simpler and more flexible way of doing that than bringing
hash opclasses into it, at least on the master branch. We could simply
do an index look-up for the exact tuple of interest in the event of a
Bloom filter probe indicating its apparent absence (corruption) --
even heap TID can participate in the search. In addition, that would
cover the whole universe of logical differences, known and unknown
(e.g. it wouldn't matter if somebody initialized alignment padding to
something non-zero, since that doesn't cause wrong answers to
queries). We might even want to offer an option where the Bloom filter
is bypassed (we go straight to probing the indexes) some proportion of
the time, or when a big misestimation in sizing the Bloom filter is
detected (i.e. almost all bits in the Bloom filter bitset are already
set by the time we start probing the filter).
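The control flow described above could be sketched roughly as follows. This is a toy Python model under stated assumptions, not amcheck or nbtree code: the `BloomFilter` class, `fingerprint`, `index_lookup`, `find_missing`, and the `saturation_threshold` parameter are all hypothetical, a sorted list searched with `bisect` stands in for the index, and `(key, tid)` pairs stand in for index tuples.

```python
import bisect
import hashlib
from decimal import Decimal


class BloomFilter:
    def __init__(self, nbits, nhashes):
        self.nbits = nbits
        self.nhashes = nhashes
        self.bits = bytearray(nbits)  # one byte per bit, for simplicity

    def _positions(self, item):
        for i in range(self.nhashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.nbits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

    def fill_ratio(self):
        return sum(self.bits) / self.nbits


def fingerprint(entry):
    # Representation-sensitive: Decimal("1.0") and Decimal("1.00") differ here.
    key, tid = entry
    return f"{key}|{tid}".encode()


def index_lookup(sorted_index, entry):
    # Exact search by logical comparison; the TID takes part as a tiebreaker.
    i = bisect.bisect_left(sorted_index, entry)
    return i < len(sorted_index) and sorted_index[i] == entry


def find_missing(index_entries, heap_entries, nbits=4096, nhashes=3,
                 saturation_threshold=0.9):
    sorted_index = sorted(index_entries)
    bloom = BloomFilter(nbits, nhashes)
    for entry in sorted_index:
        bloom.add(fingerprint(entry))
    # A nearly full bitset suggests the filter was badly undersized, so
    # skip it and go straight to probing the index.
    bypass = bloom.fill_ratio() >= saturation_threshold
    missing = []
    for entry in heap_entries:
        if not bypass and bloom.might_contain(fingerprint(entry)):
            continue  # apparently present; no further check
        # Apparent absence: confirm with a real lookup before reporting.
        if not index_lookup(sorted_index, entry):
            missing.append(entry)
    return missing


index = [(Decimal("1.00"), (0, 1)), (Decimal("2"), (0, 2))]
heap = [(Decimal("1.0"), (0, 1)), (Decimal("2"), (0, 2)), (Decimal("3"), (0, 3))]
print(find_missing(index, heap))
```

In this model the heap entry `1.0` misses the filter because its fingerprint differs from that of `1.00`, but the fallback lookup finds it by comparison, so only the entry that is genuinely absent from the index is reported. Passing `saturation_threshold=0.0` forces the bypass path and should give the same answer, at the cost of one lookup per heap entry.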
--
Peter Geoghegan