From: Tim Kane <tim(dot)kane(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-general General <pgsql-general(at)postgresql(dot)org>
Subject: Re: pg_stat_tmp
Date: 2013-12-17 11:25:06
Message-ID: CED5DA42.51EDB%tim.kane@gmail.com
Lists: pgsql-general
Thanks Jeff, Magnus
Thanks for the suggestions.
This morning the same issue occurred, but this time it also complained that
it failed to write to pg_clog (on previous days, the only failure was in
writing to pg_stat_tmp).
2013-12-17 07:13:04 GMT DETAIL: Could not write to file "pg_clog/0004" at
offset 212992: No space left on device.
2013-12-17 07:13:05 GMT ERROR: could not access status of transaction 0
2013-12-17 07:13:05 GMT DETAIL: Could not write to file "pg_clog/0004" at
offset 212992: No space left on device.
2013-12-17 07:13:06 GMT ERROR: could not access status of transaction 0
2013-12-17 07:13:06 GMT DETAIL: Could not write to file "pg_clog/0004" at
offset 212992: No space left on device.
2013-12-17 07:13:07 GMT LOG: could not write temporary statistics file
"pg_stat_tmp/pgstat.tmp": No space left on device
2013-12-17 07:13:07 GMT LOG: could not write temporary statistics file
"pg_stat_tmp/pgstat.tmp": No space left on device
2013-12-17 07:13:08 GMT LOG: could not write temporary statistics file
"pg_stat_tmp/pgstat.tmp": No space left on device
2013-12-17 07:13:08 GMT LOG: could not write temporary statistics file
"pg_stat_tmp/pgstat.tmp": No space left on device
2013-12-17 07:25:15 GMT WARNING: terminating connection because of crash of
another server process
2013-12-17 07:25:15 GMT DETAIL: The postmaster has commanded this server
process to roll back the current transaction and exit, because another
server process exited abnormally and possibly corrupted shared memory.
2013-12-17 07:25:15 GMT HINT: In a moment you should be able to reconnect
to the database and repeat your command.
2013-12-17 07:25:16 GMT LOG: all server processes terminated;
reinitializing
2013-12-17 07:25:17 GMT LOG: database system was interrupted; last known up
at 2013-12-17 07:08:22 GMT
2013-12-17 07:25:17 GMT LOG: database system was not properly shut down;
automatic recovery in progress
2013-12-17 07:25:17 GMT LOG: redo starts at 5BC/CF3F9888
2013-12-17 07:25:18 GMT LOG: could not open file
"pg_xlog/00000001000005BC000000D6" (log file 1468, segment 214): No such
file or directory
2013-12-17 07:25:18 GMT LOG: redo done at 5BC/D5FFFEC0
2013-12-17 07:25:18 GMT LOG: last completed transaction was at log time
2013-12-17 07:25:12.989653+00
2013-12-17 07:25:19 GMT LOG: database system is ready to accept connections
2013-12-17 07:25:19 GMT LOG: autovacuum launcher started
It’s never failed to write to pg_xlog though, or at least it’s never
reported a failure.
For now, I’ve altered the pg_stat_tmp path and we’ll see what happens
tomorrow. Ideally though, yes, I’d like to catch this happening at runtime
and get a better picture of what’s blowing out here.
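(For anyone following along, a minimal sketch of the stats_temp_directory
approach, assuming a directory on a roomier disk that the postgres user can
write to; the path below is illustrative only.)

-- in postgresql.conf (stats_temp_directory can be changed with a reload):
--   stats_temp_directory = '/mnt/bigdisk/pg_stat_tmp'
SHOW stats_temp_directory;    -- current location
SELECT pg_reload_conf();      -- pick up the postgresql.conf change
SHOW stats_temp_directory;    -- confirm the new location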
Further to this, however, I notice that the base directory is consuming more
disk than I would have expected it to (all our relations are stored in
tablespaces assigned to other disks).
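(A sketch of one way to cross-check that from inside the database:
reltablespace = 0 means a relation lives in the database's default
tablespace, which is base/ unless the database itself was created in another
tablespace, so something like the following should list whatever is still
physically stored there, largest first.)

SELECT oid::regclass AS rel,
       pg_relation_size(oid) AS bytes
FROM   pg_class
WHERE  reltablespace = 0
  AND  relkind IN ('r', 'i', 't')   -- tables, indexes, TOAST
ORDER  BY pg_relation_size(oid) DESC
LIMIT  10;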
Looking for the larger culprits, I note the following files consuming 4.4GB
9.2/main/base/74641/2260957
9.2/main/base/74641/2260957.1
9.2/main/base/74641/2260957.2
9.2/main/base/74641/2260957.3
9.2/main/base/74641/2260957.4
I notice the timestamps are just shy of the time we experience that
pg_stat_tmp issue.
-rw------- 1 postgres postgres 1073741824 Dec 17 06:56 2260957
-rw------- 1 postgres postgres 1073741824 Dec 17 06:56 2260957.1
-rw------- 1 postgres postgres 1073741824 Dec 17 06:56 2260957.2
-rw------- 1 postgres postgres 1073741824 Dec 17 06:56 2260957.3
-rw------- 1 postgres postgres 328466432 Dec 17 06:56 2260957.4
They appear to be orphaned relation files, though I’m not sure this check is
definitive:
clone=# select pg_relation_filepath(2260957);
pg_relation_filepath
----------------------
(1 row)
I probably would have thought these to be temporary files, except that I
have temp_tablespaces set elsewhere.
Are these truly orphaned files that PostgreSQL no longer needs (possibly due
to a crash)? Is there a mechanism to scan the PGDATA directories to locate
such files?
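(A caveat on the check above, offered as a sketch rather than a definitive
answer: the file name under base/ is the relation's relfilenode, which is
not guaranteed to equal its OID, so pg_relation_filepath(2260957) answers a
slightly different question. Looking the number up as a relfilenode covers
that case; if both checks come back empty, the file looks likely to be
orphaned, e.g. left behind by a crash, though files belonging to an
in-flight transaction can also appear unreferenced.)

SELECT oid::regclass AS rel
FROM   pg_class
WHERE  relfilenode = 2260957;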
FYI, we’re running 9.2.4
Thanks again,
Tim
From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Date: Monday, 16 December 2013 17:32
To: Tim Kane <tim(dot)kane(at)gmail(dot)com>
Cc: pgsql-general General <pgsql-general(at)postgresql(dot)org>
Subject: Re: [GENERAL] pg_stat_tmp
On Mon, Dec 16, 2013 at 5:57 AM, Tim Kane <tim(dot)kane(at)gmail(dot)com> wrote:
> Hi all,
>
> The past few days I’ve been encountering the following error, followed by a
> full db restart and recovery
>
>
> 2013-12-16 07:12:53 GMT LOG: could not write temporary statistics file
> "pg_stat_tmp/pgstat.tmp": No space left on device
Is that the only thing in the logs? pg_stat_tmp problems should not bring
down your database. But problems with pg_xlog running out of space
certainly can--but they should also be logged.
>
> This occurs at a time of moderate load, during the same set of operations each
> morning.
> Interestingly, when I execute this manually at any other time of day, the
> process completes normally.
>
> I presume that the pg_stat_tmp location is system-wide and likely is not
> impacted by temp_tablespaces
> The root partition, where postgresql is installed does not have a lot of disk
> available (4GB).
>
> My first instinct here is to symlink pg_stat_tmp against another disk with a
> little more room to breathe, however I’m surprised that pgstat.tmp would grow
> to be so large in the first place – possibly there is something else at play
> here.
We don't know how large it is getting! If pg_stat_tmp shares the same
partition as pg_xlog, base (as in the default configuration), and pg_log,
then any of those things could be filling up the partition, and pg_stat_tmp
could just be the canary, not the culprit.
Anyway, you don't need to use a symlink, you could just change
stats_temp_directory to point someplace else.
Your best bet is to run "du" or something similar to figure out where your
space is actually going.
Cheers,
Jeff