From: | Jean Landercy - BEEODIVERSITY <jean(dot)landercy(at)beeodiversity(dot)com> |
---|---|
To: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Cc: | "andersk(at)mit(dot)edu" <andersk(at)mit(dot)edu> |
Subject: | Sudden database error with COUNT(*) making Query Planner crashes: variable not found in subplan target list |
Date: | 2022-06-06 09:34:24 |
Message-ID: | VI1PR09MB438296EF87338CF6606F03FEE7A29@VI1PR09MB4382.eurprd09.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Dear PostgreSQL Hacker Community,
I am facing a tricky bug which makes the Query Planner crashes when using COUNT(*) function.
Without any upgrade suddenly a table of a database instance could not be queried this way:
SELECT COUNT(*) FROM items;
-- ERROR: variable not found in subplan target list
-- SQL state: XX000
Message and behaviour seem related to the Query Planner:
EXPLAIN SELECT COUNT(*) FROM item;
-- ERROR: variable not found in subplan target list
-- SQL state: XX000
Looks like a column name could not be found (see https://github.com/postgres/postgres/blob/ce4f46fdc814eb1b704d81640f6d8f03625d0f53/src/backend/optimizer/plan/setrefs.c#L2967-L2972) in some specific context that is somehow hard to reproduce.
Interesting facts:
SELECT COUNT(id) FROM items; -- 213
SELECT COUNT(*) FROM items WHERE id > 0; -- 213
Work as expected.
I can see that other people are recently facing a similar problem (https://www.postgresql.org/message-id/flat/4c347490-d734-5fdd-d613-1327601b4e7e%40mit.edu)
If it is the same bug then it is not related to the PGroonga extension as I don't use it all.
Anyway, the bug is difficult to reproduce on my application.
At the time of writing, I could just isolate it on a specific database but I could not draw a MCVE from it.
I am looking for help to make it reproducible and feed your knowledge database.
My first guess was to open a post of SO (see for details https://stackoverflow.com/questions/72498741/how-can-i-reproduce-a-database-context-to-debug-a-tricky-postgresql-error-vari) but digging deeper in the investigation it seems it will require people with strong insights on how PostgreSQL actually works under the hood.
Therefore, I chose this specific mailing list.
The bug is tricky to reproduce, I could not succeed to replicate elsewhere (dump/restore does not preserve it).
Anyway it makes my database unusable and looks like a potential bug for your product and applications relying on it.
Faulty setup is about:
SELECT version();
-- PostgreSQL 13.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
SELECT extname, extversion FROM pg_extension;
-- "plpgsql" "1.0"
-- "postgis" "3.1.1"
By now, the only workarounds I have found are:
* Dump database and recreate a new instance (problem seems to vanish but there is no guarantee it is solved or it will not happened later on);
* Add dummy filter on all queries (more a trick than a solution).
I am writing to this mailing list to raise you attention on it.
I'll be happy to help you investigate it deeper.
Best regards,
Landercy Jean
From | Date | Subject | |
---|---|---|---|
Next Message | kuroda.hayato@fujitsu.com | 2022-06-06 10:54:21 | RE: Multi-Master Logical Replication |
Previous Message | shiy.fnst@fujitsu.com | 2022-06-06 08:58:57 | RE: Handle infinite recursion in logical replication setup |