Re: new heapcheck contrib module

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, Amul Sul <sulamul(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: new heapcheck contrib module
Date: 2021-01-28 05:05:02
Message-ID: 411C72E5-23A6-485F-A101-FBF7CFA6D2D5@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Jan 14, 2021, at 1:13 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Mon, Jan 11, 2021 at 1:16 PM Mark Dilger
> <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>> Added in v32, along with adding pg_amcheck to @contrib_uselibpq, @contrib_uselibpgport, and @contrib_uselibpgcommon
>
> exit_utils.c fails to achieve the goal of making this code independent
> of pg_dump, because of:
>
> #ifdef WIN32
> if (parallel_init_done && GetCurrentThreadId() != mainThreadId)
> _endthreadex(code);
> #endif
>
> parallel_init_done is a pg_dump-ism. Perhaps this chunk of code could
> be a handler that gets registered using exit_nicely() rather than
> hard-coded like this. Note that the function comments for
> exit_nicely() are heavily implicated in this problem, since they also
> apply to stuff that only happens in pg_dump and not other utilities.

The 0001 patch has been restructured to not have this problem.

> I'm skeptical about the idea of putting functions into string_utils.c
> with names as generic as include_filter() and exclude_filter().
> Existing cases like fmtId() and fmtQualifiedId() are not great either,
> but I think this is worse and that we should do some renaming. On a
> related note, it's not clear to me why these should be classified as
> string_utils while stuff like expand_schema_name_patterns() gets
> classified as option_utils. These are neither generic
> string-processing functions nor are they generic options-parsing
> functions. They are functions for expanding shell-glob style patterns
> for database object names. And they seem like they ought to be
> together, because they seem to do closely-related things. I'm open to
> an argument that this is wrongheaded on my part, but it looks weird to
> me the way it is.

The logic to filter which relations are checked is completely restructured and is kept in pg_amcheck.c

> I'm pretty unimpressed by query_utils.c. The CurrentResultHandler
> stuff looks grotty, and you don't seem to really use it anywhere. And
> it seems woefully overambitious to me anyway: this doesn't apply to
> every kind of "result" we've got hanging around, absolutely nothing
> even close to that, even though a name like CurrentResultHandler
> sounds very broad. It also means more global variables, which is a
> thing of which the PostgreSQL codebase already has a deplorable
> oversupply. quiet_handler() and noop_handler() aren't used anywhere
> either, AFAICS.
>
> I wonder if it would be better to pass in callbacks rather than
> relying on global variables. e.g.:
>
> typedef void (*fatal_error_callback)(const char *fmt,...)
> pg_attribute_printf(1, 2) pg_attribute_noreturn();
>
> Then you could have a few helper functions that take an argument of
> type fatal_error_callback and throw the right fatal error for (a)
> wrong PQresultStatus() and (b) result is not one row. Do you need any
> other cases? exiting_handler() seems to think that the caller might
> want to allow any number of tuples, or any positive number, or any
> particular cout, but I'm not sure if all of those cases are really
> needed.

The error callback stuff has been refactored in this next patch set, and also now includes handlers for parallel slots, as the src/bin/scripts/scripts_parallel.c stuff has been moved to fe_utils and made more general. As it was, there were hardcoded assumptions that are valid for reindexdb and vacuumdb, but not general enough for pg_amcheck to use. The refactoring in patches 0002 through 0005 make it more generally usable. Patch 0008 uses it in pg_amcheck.

> This stuff is finnicky and hard to get right. You don't really want to
> create a situation where the same code keeps getting duplicated, or
> the behavior's just a little bit inconsistent everywhere, but it also
> isn't great to build layers upon layers of abstraction around
> something like ExecuteSqlQuery which is, in the end, a four-line
> function. I don't think there's any problem with something like
> pg_dump having it's own function to execute-a-query-or-die. Maybe that
> function ends up doing something like
> TheGenericFunctionToExecuteOrDie(my_die_fn, the_query), or maybe
> pg_dump can just open-code it but have a my_die_fn to pass down to the
> glob-expansion stuff, or, well, I don't know.

There are some real improvements in this next patch set.

The number of queries issued to the database to determine the databases to use is much reduced. I had been following the pattern in pg_dump, but abandoned that for something new.

The parallel slots stuff is now used for parallelism, much like what is done in vacuumdb and reindexdb.

The pg_amcheck application can now be run over one database, multiple specified databases, or all databases.

Relations, schemas, and databases can be included and excluded by pattern, like "(db1|db2|db3).myschema.(mytable|myindex)". The real-world use-cases for this that I have in mind are things like:

pg_amcheck --jobs=12 --all \
--exclude-relation="db7.schema.known_corrupt_table" \
--exclude-relation="db*.schema.known_big_table"

and

pg_amcheck --jobs=20 \
--include-relation="*.compliance.audited"

I might be missing something, but I think the interface is a superset of the interface from reindexdb and vacuumdb. None of the new interface stuff (patterns, allowing multiple databases to be given on the command line, etc) is required.

Attachment Content-Type Size
v33-0001-Moving-exit_nicely-and-fatal-into-fe_utils.patch application/octet-stream 13.1 KB
v33-0002-Introducing-PGresultHandler-abstraction.patch application/octet-stream 5.8 KB
v33-0003-Preparing-for-move-of-parallel-slot-infrastructu.patch application/octet-stream 21.3 KB
v33-0004-Moving-and-renaming-scripts_parallel.patch application/octet-stream 11.0 KB
v33-0005-Parameterizing-parallel-slot-result-handling.patch application/octet-stream 11.8 KB
v33-0006-Moving-handle_help_version_opts.patch application/octet-stream 8.5 KB
v33-0007-Refactoring-processSQLNamePattern.patch application/octet-stream 10.5 KB
v33-0008-Adding-contrib-module-pg_amcheck.patch application/octet-stream 126.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2021-01-28 05:43:50 Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Previous Message Amit Kapila 2021-01-28 04:54:24 Re: On login trigger: take three