From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Subject: | Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system |
Date: | 2016-02-03 05:46:59 |
Message-ID: | 20160203054659.GA4135993@tornado.leadboat.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Feb 01, 2016 at 07:03:45PM +0100, Tomas Vondra wrote:
> On 12/22/2015 03:49 PM, Noah Misch wrote:
> >On Mon, Feb 18, 2013 at 06:19:12PM -0300, Alvaro Herrera wrote:
> >>I have pushed it now. Further testing, of course, is always welcome.
> >
> >While investigating stats.sql buildfarm failures, mostly on animals axolotl
> >and shearwater, I found that this patch (commit 187492b) inadvertently removed
> >the collector's ability to coalesce inquiries. Every PGSTAT_MTYPE_INQUIRY
> >received now causes one stats file write. Before, pgstat_recv_inquiry() did:
> >
> > if (msg->inquiry_time > last_statrequest)
> > last_statrequest = msg->inquiry_time;
> >
> >and pgstat_write_statsfile() did:
> >
> > globalStats.stats_timestamp = GetCurrentTimestamp();
> >... (work of writing a stats file) ...
> > last_statwrite = globalStats.stats_timestamp;
> > last_statrequest = last_statwrite;
> >
> >If the collector entered pgstat_write_statsfile() with more inquiries waiting
> >in its socket receive buffer, it would ignore them as being too old once it
> >finished the write and resumed message processing. Commit 187492b converted
> >last_statrequest to a "last_statrequests" list that we wipe after each write.
>
> So essentially we remove the list of requests, and thus on the next round we
> don't know the timestamp of the last request and write the file again
> unnecessarily. Do I get that right?
Essentially right. Specifically, for each database, we must remember the
globalStats.stats_timestamp of the most recent write. It could be okay to
forget the last request timestamp. (I now doubt I picked the best lines to
quote, above.)
> What if we instead kept the list but marked the requests as 'invalid' so
> that we know the timestamp? In that case we'd be able to do pretty much
> exactly what the original code did (but at per-db granularity).
The most natural translation of the old code would be to add a write_time
field to struct DBWriteRequest. One can infer "invalid" from write_time and
request_time. There are other reasonable designs, though.
> We'd have to cleanup the list once in a while not to grow excessively large,
> but something like removing entries older than PGSTAT_STAT_INTERVAL should
> be enough.
Specifically, if you assume the socket delivers messages in the order sent,
you may as well discard entries having write_time at least
PGSTAT_STAT_INTERVAL older than the most recent cutoff_time seen in a
PgStat_MsgInquiry. That delivery order assumption does not hold in general,
but I expect it's close enough for this purpose.
> >As for how to fix this, the most direct translation of the old logic is to
> >retain last_statrequests entries that could help coalesce inquiries.I lean
> >toward that for an initial, back-patched fix.
>
> That seems reasonable and I believe it's pretty much the idea I came up with
> above, right? Depending on how you define "entries that could help coalesce
> inquiries".
Yes.
From | Date | Subject | |
---|---|---|---|
Next Message | Konstantin Knizhnik | 2016-02-03 07:37:21 | Re: PATCH: index-only scans with partial indexes |
Previous Message | Amit Kapila | 2016-02-03 05:42:36 | Re: WAL Re-Writes |