Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

From: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <akorotkov(at)postgresql(dot)org>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Svetlana Derevyanko <s(dot)derevyanko(at)postgrespro(dot)ru>
Subject: Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)
Date: 2025-03-07 18:53:57
Message-ID: 202503071853.5fd6bjxakhe2@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

I find that this is still quite broken -- both the original, and your
patch. I have already complained about a fundamental bug in the
original in [1]. In addition to what I reported there, in the unpatched
code I noticed that we're wasting memory and CPU by comparing the
qualified table name, when it would be faster to just store the table
OID and do the comparison based on that. Because the REINDEX INDEX
query only needs to specify the *index* name, not the table name, we
don't need the table name to be stored in the indices_tables_list: we
can convert it into an OID list and store that instead. The strcmp
becomes a numeric comparison. Yay. This is of course a pretty trivial,
almost meaningless change.

[1] https://postgr.es/m/202503071820.j25zn3lo4hvn@alvherre.pgsql

But your patch also makes the mistake that indices_table_list is passed
as a pointer by value, so the list that is filled by
get_parallel_tabidx_list() can never have any effect. I wonder if you
tested this patch at all. With your patch and the sample setup I posted
in [1], the program doesn't do anything. It never calls REINDEX at all.
It is a dead program. It's so bogus that in fact, there's no program,
it's just a bug that takes a lot of disk space.

For this to work, we need to pass that list (the output list) as a
pointer reference, so that the caller can _receive_ the output list.

We also get this compiler warning:

/pgsql/source/master/src/bin/scripts/reindexdb.c: In function ‘get_parallel_tabidx_list’:
/pgsql/source/master/src/bin/scripts/reindexdb.c:782:17: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
782 | if (cell != index_list->head)
| ^~
/pgsql/source/master/src/bin/scripts/reindexdb.c:785:25: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
785 | appendQualifiedRelation(&catalog_query, cell->val, conn, echo);
| ^~~~~~~~~~~~~~~~~~~~~~~

Not to mention the 'git apply' warnings:

I schmee: master 0$ git apply /tmp/v3-simplifies-reindex-one-database-reindexdb.patch
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:191: space before tab in indent.
fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:192: space before tab in indent.
PQgetvalue(res, i, 0),
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:193: space before tab in indent.
PQclientEncoding(conn)));
advertencia: 3 líneas agregan errores de espacios en blanco.

Anyway, my version of this is attached. It fixes the problems with your
patch, but not Orlov's fundamental bug. Without fixing that bug, this
program does not deserve this supposedly parallel mode that doesn't
actually deliver. I wonder if we shouldn't just revert 47f99a407de2
completely.

You, on the other hand, really need to be much more careful with the
patches you submit.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Selbst das größte Genie würde nicht weit kommen, wenn es
alles seinem eigenen Innern verdanken wollte." (Johann Wolfgang von Goethe)
Ni aún el genio más grande llegaría muy lejos si
quisiera sacarlo todo de su propio interior.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2025-03-07 18:57:55 Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)
Previous Message Joel Jacobson 2025-03-07 18:51:53 Re: INSERT ... ON CONFLICT DO SELECT [FOR ...] take 2