Re: [BUG] temporary file usage report with extended protocol and unnamed portals

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Benoit Lobréau <benoit(dot)lobreau(at)dalibo(dot)com>, Guillaume Lelarge <guillaume(dot)lelarge(at)dalibo(dot)com>, Pierrick Chovelon <pierrick(dot)chovelon(at)dalibo(dot)com>
Subject: Re: [BUG] temporary file usage report with extended protocol and unnamed portals
Date: 2025-04-19 22:42:53
Message-ID: CAA5RZ0ssqRTz_9T0Gy74SiTViiX3V0rSFxc4N_4GNcbEBK9wow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> Sami Imseih <samimseih(at)gmail(dot)com> writes:
> > I think the solution proposed by Frédéric seems reasonable: to switch
> > the debug_query_string
> > inside PortalDrop. However, I do not like the way the
> > debug_query_string changes values
> > after the CreatePortal call inside exec_bind_message; that seems incorrect.
> > So, I believe we should temporarily switch the debug_query_string
> > value only while
> > running PortalDrop. Attached is what I think could be safer to do.
> > What do you think?
>
> I don't think this is safe at all. The portal's query string
> is potentially shorter-lived than the portal, see in particular
> exec_simple_query which just passes a pointer to the original
> string in MessageContext.

Yes, you are correct. The comments about MessageContect in
mmgr/README give me pause.

"....
This
is kept separate from per-transaction and per-portal contexts because a
query string might need to live either a longer or shorter time than any
single transaction or portal.
"""

> I'm frankly inclined to do nothing, but if we must do something,
> the way to fix it here would be to transiently set debug_query_string
> to NULL so that the actions of PortalDrop aren't blamed on any
> particular query. (In my testing, the "temporary file:" message comes
> out without any attached STATEMENT most of the time already, so this
> isn't losing much as far as that's concerned.)

moreover, as I was looking into why the case mentioned earlier

```
SELECT FROM foo ORDER BY a \bind
;
SELECT 1 \bind
;
```

does not show STATEMENT after the temp file logging, I realized
it's because the temp files are cleaned up and reported at
the end of transaction, which means that debug_query_string is NULL at the
time the portal is dropped in the next query. This causes
check_log_of_query to return false.

/* query string available? */
if (debug_query_string == NULL)
return false;

> Perhaps a cleaner answer is to rearrange things in postgres.c
> so that if there's a pre-existing unnamed portal, we drop that
> before we ever set debug_query_string and friends at all.

That did cross my mind as well, but I was trying to avoid doing this
type of rearranging. I still rather not go down that path considering the
case mentioned above will still not display the query text in a STATEMENT log.

> I'm frankly inclined to do nothing, but if we must do something,
> the way to fix it here would be to transiently set debug_query_string
> to NULL so that the actions of PortalDrop aren't blamed on any
> particular query.

I think this is better, because I rather we avoid lines like the below in which
there are temp files being reported all with STATEMENT logging
of a different query. It's better to just not show STATEMENT at all.

```
2025-04-19 16:44:38.082 CDT [38963] STATEMENT: SELECT 1
2025-04-19 16:44:38.115 CDT [38963] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp38963.1", size 1073741824
2025-04-19 16:44:38.115 CDT [38963] STATEMENT: SELECT 1
2025-04-19 16:44:38.149 CDT [38963] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp38963.2", size 1073741824
2025-04-19 16:44:38.149 CDT [38963] STATEMENT: SELECT 1
2025-04-19 16:44:38.305 CDT [38963] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp38963.3", size 1073741824
2025-04-19 16:44:38.305 CDT [38963] STATEMENT: SELECT 1
2025-04-19 16:44:38.558 CDT [38963] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp38963.4", size 1073741824
2025-04-19 16:44:38.558 CDT [38963] STATEMENT: SELECT 1
2025-04-19 16:44:38.744 CDT [38963] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp38963.5", size 1073741824
```

--
Sami Imseih
Amazon Web Services (AWS)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-04-19 22:57:39 Re: Rename injection point names in test_aio
Previous Message Chapman Flack 2025-04-19 21:35:44 Re: transforms