From: | Andrey Borodin <amborodin86(at)gmail(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Jose Arthur Benetasso Villanova <jose(dot)arthur(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Amcheck verification of GiST and GIN |
Date: | 2023-03-19 23:00:14 |
Message-ID: | CAAhFRxhfhk1g+YkQsduikMKwetC-h4cbKAfWQOH3aYn-XzAkBg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Mar 17, 2023 at 8:40 PM Andrey Borodin <amborodin86(at)gmail(dot)com> wrote:
>
> On Thu, Mar 16, 2023 at 6:23 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> >
> > existence of a "same" routine hints at some confusion about "equality
> > versus equivalence" issues.
>
> Hmm...yes, actually, GiST deals with floats routinely. And there might
> be some sorts of NaNs and Infs that are equal, but not binary
> equivalent.
> I'll think more about it.
>
> gist_get_adjusted() calls "same" routine, which for type point will
> use FPeq(double A, double B). And this might be kind of a corruption
> out of the box. Because it's an epsilon-comparison, ε=1.0E-06.
> GiST might miss newly inserted data, because the "adjusted" tuple was
> "same" if data is in proximity of 0.000001 of any previously indexed
> point, but out of known MBRs.
> I'll try to reproduce this tomorrow, so far no luck.
>
After several attempts to corrupt GiST with this 0.000001 epsilon
adjustment tolerance I think GiST indexing of points is valid.
Because intersection for search purposes is determined with the same epsilon!
So it's kind of odd
postgres=# select point(0.0000001,0)~=point(0,0);
?column?
----------
t
(1 row)
, yet the index works correctly.
On Thu, Mar 16, 2023 at 4:48 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Sun, Feb 5, 2023 at 4:45 PM Andrey Borodin <amborodin86(at)gmail(dot)com> wrote:
> > Here's v24 == (v23 + a step for pg_amcheck). There's a lot of
> > shotgun-style changes, but I hope next index types will be easy to add
> > now.
>
> Some feedback on the GiST patch:
>
> * You forgot to initialize GistCheckState.heaptuplespresent to 0.
>
> It might be better to allocate GistCheckState dynamically, using
> palloc0(). That's future proof. "Simple and obvious" is usually the
> most important goal for managing memory in amcheck code. It can be a
> little inefficient if that makes it simpler.
Done.
> * ISTM that gist_index_check() should allow the caller to omit a
> "heapallindexed" argument by specifying "DEFAULT FALSE", for
> consistency with bt_index_check().
Done.
> * What's the point in having a custom memory context that is never reset?
The problem is we traverse index with depth-first scan and must retain
internal tuples for a whole time of the scan.
And gistgetadjusted() will allocate memory only in case of suspicion
of corruption. So, it's kind of an infrequent case.
The context is there only as an overall leak protection mechanism.
Actual memory management is done via pfree() calls.
> Again, "simple and obvious" is good for memory management in amcheck.
Yes, that would be great to come up with some "unit of work" contexts.
Yet, now palloced tuples and scan items have very different lifespans.
> * ISTM that it would be clearer if the per-page code within
> gist_check_parent_keys_consistency() was broken out into its own
> function -- a little like bt_target_page_check()..
I've refactored page logic into gist_check_page().
> * ISTM that gist_refind_parent() should throw an error about
> corruption in the event of a parent page somehow becoming a leaf page.
Done.
> * I suggest using c99 style variable declarations in loops.
Done.
On Thu, Mar 16, 2023 at 6:23 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Thu, Mar 16, 2023 at 4:48 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > Some feedback on the GiST patch:
>
> I see that the Bloom filter that's used to implement heapallindexed
> verification fingerprints index tuples that are formed via calls to
> gistFormTuple(), without any attempt to normalize-away differences in
> TOAST input state. In other words, there is nothing like
> verify_nbtree.c's bt_normalize_tuple() function involved in the
> fingerprinting process. Why is that safe, though? See the "toast_bug"
> test case within contrib/amcheck/sql/check_btree.sql for an example of
> how inconsistent TOAST input state confused verify_nbtree.c's
> heapallindexed verification (before bugfix commit eba775345d). I'm
> concerned about GiST heapallindexed verification being buggy in
> exactly the same way, or in some way that is roughly analogous.
FWIW contrib opclasses, AFAIK, always detoast possibly long datums,
see gbt_var_compress()
https://github.com/postgres/postgres/blob/master/contrib/btree_gist/btree_utils_var.c#L281
But there might be opclasses that do not do so...
Also, there are INCLUDEd attributes. Right now we just put them as-is
to the bloom filter. Does this constitute a TOAST bug as in B-tree?
If so, I think we should use a version of tuple formatting that omits
included attributes...
What do you think?
>
> I do have some concerns about there being analogous problems that are
> unique to GiST, since GiST is an AM that gives opclass authors many
> more choices than B-Tree opclass authors have. In particular, I wonder
> if heapallindexed verification needs to account for how GiST
> compression might end up breaking heapallindexed. I refer to the
> "compression" implemented by GiST support routine 3 of GiST opclasses.
> The existence of GiST support routine 7, the "same" routine, also
> makes me feel a bit squeamish about heapallindexed verification -- the
> existence of a "same" routine hints at some confusion about "equality
> versus equivalence" issues.
>
> In more general terms: heapallindexed verification works by
> fingerprinting index tuples during the index verification stage, and
> then performing Bloom filter probes in a separate CREATE INDEX style
> heap-matches-index stage (obviously). There must be some justification
> for our assumption that there can be no false positive corruption
> reports due only to a GiST opclass (either extant or theoretical) that
> follows the GiST contract, and yet allows an inconsistency to arise
> that isn't really index corruption. This justification won't be easy
> to come up with, since the GiST contract was not really designed with
> these requirements in mind. But...we should try to come up with
> something.
>
> What are the assumptions underlying heapallindexed verification for
> GiST? It doesn't have to be provably correct or anything, but it
> should at least be empirically falsifiable. Basically, something that
> says: "Here are our assumptions, if we were wrong in making these
> assumptions then you could tell that we made a mistake because of X,
> Y, Z". It's not always clear when something is corrupt. Admittedly I
> have much less experience with GiST than other people, which likely
> includes you (Andrey). I am likely missing some context around the
> evolution of GiST. Possibly I'm making a big deal out of something
> without it being unhelpful. Unsure.
>
> Here is an example of the basic definition of correctness being
> unclear, in a bad way: Is a HOT chain corrupt when its root
> LP_REDIRECT points to an LP_DEAD item, or does that not count as
> corruption? I'm pretty sure that the answer is ambiguous even today,
> or was ambiguous until recently, at least. Hopefully the
> verify_heapam.c HOT chain verification patch will be committed,
> providing us with a clear *definition* of HOT chain corruption -- the
> definition itself may not be the easy part.
Rules for compression methods are not described anyware. And I suspect
that it's intentional, to provide more flexibility.
To make heapallindexed check work we need that opclass always returns
the same compression result for the same input datum.
All known to me opclasses (built-in and PostGIS) comply with this requirement.
Yet another behavior might be reasonable. Consider we have a
compression which learns on data. It will observe that some datums are
more frequent and start using shorter version of them.
Compression function actually is not about compression, but kind of a
conversion from heap format to indexable. Many opclasses do not have a
compression function at all.
We can require that the checked opclass would not have a compression
function at all. But GiST is mainly used for PostGIS, and in PostGIS
they use compression to convert complex geometry into a bounding box.
Method "same" is used only for a business of internal tuples, but not
for leaf tuples that we fingerprint in the bloom filter.
We can put requirements for heapallindexed in another way: "the
opclass compression method must be a pure function". It's also a very
strict requirement, disallowing all kinds of detoasting, dictionary
compression etc. And btree_gist opclasses does not comply :) But they
seem to me safe for heapallindexed.
> On a totally unrelated note: I wonder if we should be checking that
> internal page tuples have 0xffff as their offset number? Seems like
> it'd be a cheap enough cross-check.
>
Done.
Thank you!
Best regards, Andrey Borodin.
Attachment | Content-Type | Size |
---|---|---|
v25-0003-Add-gin_index_parent_check-to-verify-GIN-index.patch | application/octet-stream | 32.3 KB |
v25-0004-Add-GiST-support-to-pg_amcheck.patch | application/octet-stream | 30.7 KB |
v25-0001-Refactor-amcheck-to-extract-common-locking-routi.patch | application/octet-stream | 18.6 KB |
v25-0002-Add-gist_index_check-function-to-verify-GiST-ind.patch | application/octet-stream | 29.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2023-03-19 23:43:27 | Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry |
Previous Message | Melanie Plageman | 2023-03-19 22:59:12 | Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode |