Re: pg_amcheck contrib application

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, Amul Sul <sulamul(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_amcheck contrib application
Date: 2021-04-30 19:21:09
Message-ID: BC832D42-BA40-4F77-B139-1EDA9E509D00@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Apr 30, 2021, at 11:56 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Fri, Apr 30, 2021 at 2:31 PM Mark Dilger
> <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>> Just to be clear, I did not use your patch v1 as the starting point.
>
> I thought that might be the case, but I was trying to understand what
> you didn't like about my version, and comparing them seemed like a way
> to figure that out.
>
>> I took the code as committed to master as the starting point, used your corruption report verbiage changes and at least some of your variable naming choices, but did not use the rest, in large part because it didn't work. It caused corruption messages to be reported against tables that have no corruption. For that matter, your v2 patch doesn't work either, and in the same way. To wit:
>>
>> heap table "postgres"."pg_catalog"."pg_rewrite", block 6, offset 4, attribute 7:
>> toast value 13461 chunk 0 has size 1995, but expected size 1996
>>
>> I think there is something wrong with the way you are trying to calculate and use extsize, because I'm not corrupting pg_catalog.pg_rewrite. You can get these same results by applying your patch to master, building, and running 'make check' from src/bin/pg_amcheck/
>
> Argh, OK, I didn't realize. Should be fixed in this version.
>
>>> 4. You don't return if chunk_seq > last_chunk_seq. That seems wrong,
>>> because we cannot compute a sensible expected size in that case. I
>>> think your code will subtract a larger value from a smaller one and,
>>> this being unsigned arithmetic, say that the expected chunk size is
>>> something gigantic.
>>
>> Your conclusion is probably right, but I think your analysis is based on a misreading of what "last_chunk_seq" means. It's not the last one seen, but the last one expected. (Should we rename the variable to avoid confusion?) It won't compute a gigantic size. Rather, it will expect *every* chunk with chunk_seq >= last_chunk_seq to have whatever size is appropriate for the last chunk.
>
> I realize it's the last one expected. That's the point: we don't have
> any expectation for the sizes of chunks higher than the last one we
> expected to see. If the value is 2000 bytes and the chunk size is 1996
> bytes, we expect chunk 0 to be 1996 bytes and chunk 1 to be 4 bytes.
> If not, we can complain. But it makes no sense to complain about chunk
> 2 being of a size we don't expect. We don't expect it to exist in the
> first place, so we have no notion of what size it ought to be.
>
>> If we have seen any chunks, the variable is holding the expected next chunk seq, which is one greater than the last chunk seq we saw.
>>
>> If we expect chunks 0..3 and see chunk 0 but not chunk 1, it will complain ..."expected to end at chunk 4, but ended at chunk 1". This is clearly by design and not merely a bug, though I tend to agree with you that this is a strange wording choice. I can't remember exactly when and how we decided to word the message this way, but it has annoyed me for a while, and I assumed it was something you suggested a while back, because I don't recall doing it. Either way, since you seem to also be bothered by this, I agree we should change it.
>
> Can you review this version?
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
> <simply-remove-chunkno-concept-v3.patch>

As requested off-list, here are NOT FOR COMMIT, WIP patches for testing only.

The first patch allows toast tables to be updated and adds regression tests of corrupted toasted attributes. I never quite got deletes from toast tables to work, and there are probably other gotchas still lurking even with inserts and updates, but it limps along well enough for testing pg_amcheck.

The second patch updates the expected output of pg_amcheck to match the verbiage that you suggested upthread.

Attachment Content-Type Size
v1-0001-Adding-modify-toast-and-test-pg_amcheck.patch.WIP application/octet-stream 8.6 KB
v1-0002-Modifying-toast-corruption-test-expected-output.patch.WIP application/octet-stream 6.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2021-04-30 19:26:36 Re: pg_amcheck contrib application
Previous Message Robert Haas 2021-04-30 19:19:56 Re: MaxOffsetNumber for Table AMs