Re: Unkillable processes creating millions of tiny temp files

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeremy Finzel <finzelj(at)gmail(dot)com>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Unkillable processes creating millions of tiny temp files
Date: 2021-03-05 22:28:40
Message-ID: 1517658.1614983320@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Jeremy Finzel <finzelj(at)gmail(dot)com> writes:
> We are running postgres 11.9 (were running 11.7 prior to recent restart) on
> a large db (10s of TB) with 5 or 6 tablespaces and 1000s of tables/indexes.

> Within the past few days we have started to see a few queries running for
> over 8 hours which we then attempt to terminate, but will not terminate.
> These queries are also generating hundreds of thousands of tiny/empty temp
> files. In fact, before the restart there were over 23 million files in
> pg_tmp which were removed. We also have verified no server settings have
> changed for at least a couple weeks, well before this issue started
> happening only in the past few days.

Hm. For the query plan you show, I think the only plausible explanation
for using temp files is that a hash join's hash table is exceeding
work_mem so it's spilling batches of tuples to disk. With some bad
luck those could be small not large. But I don't see anything in
our commit logs between 11.7 and 11.9 that looks like it would have
affected any of that behavior. (There were some changes to code
governing temp-file tablespace selection, but that could only affect
where the files get put not how big they are.) So I doubt that this
can be blamed on the update, especially since if I read you correctly
it didn't start happening immediately after the update.

I'm wondering about changes in table statistics possibly causing a
poorly-chosen change in the hashing parameters. What have you got
work_mem set to? Can you comment on whether the estimated rowcounts
shown in the query plan are accurate? Does manually ANALYZE'ing the
tables used in the query change anything?

The "unkillable" aspect is odd, but I wonder if that's just a red
herring. A query that's generated lots of temp files will try to
clean them up at termination, so maybe the backend is just sitting
there removing temp files before it'll give control back.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jerry Sievers 2021-03-05 23:44:57 Re: Unkillable processes creating millions of tiny temp files
Previous Message Jeremy Finzel 2021-03-05 21:34:37 Unkillable processes creating millions of tiny temp files