Removing pgsql_tmp files

From: Michael Glaesemann <michael(dot)glaesemann(at)myyearbook(dot)com>
To: pgsql-general(at)postgresql(dot)org
Cc: Michael Glaesemann <michael(dot)glaesemann(at)myyearbook(dot)com>
Subject: Removing pgsql_tmp files
Date: 2010-11-08 20:37:50
Message-ID: DD030E3C-1892-446D-B3C5-4B55F789E7C7@myyearbook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

We've got over 250GB of files in a pgsql_tmp directory, some with modification timestamps going back to August 2010 when the server was last restarted.

select pg_postmaster_start_time();
pg_postmaster_start_time
-------------------------------
2010-08-08 22:53:31.999804-04
(1 row)

I'm not sure why the temp files aren't being cleaned up. I can confirm we haven't had a backend crash in at least a week (from inspection of the log files). The oldest backend goes back to the end of October:

production=# select current_timestamp, min(backend_start) from pg_stat_activity;
now | min
-------------------------------+-------------------------------
2010-11-08 15:23:25.331311-05 | 2010-10-27 05:51:02.707688-04
(1 row)

To confirm which files are no longer being used, I've come up with the following query (where 16384 is the tablespace in question):

SELECT filename, pid, (pg_stat_file(dir || '/' || filename)).modification AS modified_at
FROM (SELECT CAST('pg_tblspc/16384/pgsql_tmp' AS TEXT) as dir, filename,
CAST(regexp_replace(filename, $r$^pgsql_tmp(\d+)\..*$$r$, $rr$\1$rr$, 'g') AS INT) as pid
FROM pg_ls_dir('pg_tblspc/16384/pgsql_tmp') AS the (filename)) AS temp_files (dir, filename, pid)
LEFT JOIN pg_stat_activity ON procpid = pid
WHERE procpid IS NULL
ORDER BY modified_at;

filename | pid | modified_at
-------------------+-------+------------------------
pgsql_tmp29522.1 | 29522 | 2010-08-09 00:26:42-04
pgsql_tmp31962.0 | 31962 | 2010-08-09 00:45:41-04
pgsql_tmp29258.0 | 29258 | 2010-08-09 00:46:01-04
pgsql_tmp1478.0 | 1478 | 2010-08-09 00:46:16-04
pgsql_tmp1482.2 | 1482 | 2010-08-09 00:46:18-04
pgsql_tmp29267.1 | 29267 | 2010-08-09 01:02:34-04
...

pgsql_tmp21928.0 | 21928 | 2010-11-08 00:32:48-05
pgsql_tmp20825.0 | 20825 | 2010-11-08 02:17:51-05
pgsql_tmp878.0 | 878 | 2010-11-08 02:25:23-05
pgsql_tmp8064.0 | 8064 | 2010-11-08 03:47:26-05
pgsql_tmp31645.0 | 31645 | 2010-11-08 07:09:40-05
pgsql_tmp25245.0 | 25245 | 2010-11-08 09:33:24-05
pgsql_tmp2302.3 | 2302 | 2010-11-08 09:59:45-05
pgsql_tmp17961.0 | 17961 | 2010-11-08 11:29:12-05
(2685 rows)

Does this query look reasonable? What other things should I take into account before I start deleting files from the file system? Why might these files not be cleaned up on their own?

Cheers,
Michael Glaesemann
michael(dot)glaesemann(at)myyearbook(dot)com

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jakub Ouhrabka 2010-11-08 20:39:45 Re: ERROR: Out of memory - when connecting to database
Previous Message Tom Lane 2010-11-08 20:31:58 Re: ERROR: Out of memory - when connecting to database

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-11-08 20:38:16 Re: Should we use make -k on the buildfarm?
Previous Message Tom Lane 2010-11-08 20:27:14 Re: UNION ALL has higher cost than inheritance