From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | last_statrequest is in the future |
Date: | 2010-03-24 15:39:14 |
Message-ID: | 22006.1269445154@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Well, I didn't actually think that this patch
http://archives.postgresql.org/pgsql-committers/2010-03/msg00181.php
would yield much insight, but lookee what we have here:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguar&dt=2010-03-24%2004:00:07
[4ba99150.5099:483] LOG: statement: VACUUM ANALYZE num_exp_add;
[4ba99145.5071:1] LOG: last_statrequest is in the future, resetting
[4ba99145.5071:2] LOG: last_statrequest is in the future, resetting
[4ba99145.5071:3] LOG: last_statrequest is in the future, resetting
[4ba99145.5071:4] LOG: last_statrequest is in the future, resetting
[4ba99145.5071:5] LOG: last_statrequest is in the future, resetting
...
[4ba99145.5071:497] LOG: last_statrequest is in the future, resetting
[4ba99145.5071:498] LOG: last_statrequest is in the future, resetting
[4ba99145.5071:499] LOG: last_statrequest is in the future, resetting
[4ba99145.5071:500] LOG: last_statrequest is in the future, resetting
[4ba99150.5099:484] WARNING: pgstat wait timeout
There are multiple occurrences of "pgstat wait timeout" in the
postmaster log (some evidently from autovacuum, because they don't show
up as regression diffs), and every one of them is associated with a
bunch of "last_statrequest is in the future" bleats.
So at least on jaguar, it seems that the reason for this behavior is
that the system clock is significantly skewed between the stats
collector process and the backends, to the point where stats updates
generated by the collector will never appear new enough to satisfy the
requesting backends. I think I'm going to go back and modify the code
to show the actual numbers involved so we can see just how bad it is ---
but the skew must be more than five seconds or we'd not be seeing this
failure. That seems to me to put it in the class of "system bug".
Should we redesign the stats signaling logic to work around this,
or just hope we can nag kernel people into fixing it?
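To make the reasoning above concrete, here is a toy model of the failure mode. It is not the pgstat.c logic. It assumes a backend that notes its own clock at request time, then polls until it sees a stats file stamped at or after that time, while the collector stamps the file using a clock that runs `skew_ms` behind. The function name, the poll interval, and the five-second timeout (inferred from the "more than five seconds" remark) are all assumptions for illustration.

```python
def wait_for_fresh_stats(skew_ms, timeout_ms=5000, poll_ms=10):
    """Toy model: return how long the backend waited (in ms) before it saw
    a stats file it considers fresh, or None if it gave up (the analogue
    of "pgstat wait timeout").

    skew_ms is how far the collector's clock lags the backend's clock.
    All names and defaults here are hypothetical, not PostgreSQL's.
    """
    request_time = 1_000_000  # backend's clock when it asks for stats (ms)
    waited = 0
    while waited < timeout_ms:
        waited += poll_ms
        backend_now = request_time + waited
        # The collector stamps the file with its own, lagging, clock.
        collector_now = backend_now - skew_ms
        file_timestamp = collector_now
        # The backend accepts only a file stamped at or after its request.
        if file_timestamp >= request_time:
            return waited
    return None


if __name__ == "__main__":
    for skew in (0, 2000, 6000):
        print(skew, wait_for_fresh_stats(skew))
```

In this model a skew smaller than the timeout only delays the backend by roughly the skew, while a skew larger than the timeout means no file ever looks new enough, which is consistent with the inference that the skew on jaguar must exceed five seconds.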
regards, tom lane