Re: Stats collector frozen?

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Jeremy Haile <jhaile(at)fastmail(dot)fm>, pgsql-general(at)postgresql(dot)org
Subject: Re: Stats collector frozen?
Date: 2007-01-26 15:01:22
Message-ID: 20070126150122.GA32024@svr2.hagander.net
Lists: pgsql-general

On Fri, Jan 26, 2007 at 09:55:39AM -0500, Tom Lane wrote:
> Magnus Hagander <magnus(at)hagander(dot)net> writes:
> > Apparently there is a bug lurking somewhere in pgwin32_select(). Because
> > if I put a #undef select right before the select in pgstat.c, the
> > regression tests pass.
> > I guess the bug is shown because with row level stats we simply have
> > more data to process. And it appears only to happen on UDP sockets from
> > what I can tell.
>
> Hmm ... if this theory is correct, then statistics collection has
> never worked at all on Windows, at least not under more than the most
> marginal load; and hence neither has autovacuum.

We have had lots of reports of issues with the stats collector on
Windows. Some were definitely fixed by the patch by O&T, but I don't
think all.
The thing is, since it didn't give any error messages at all, most users
wouldn't notice. Other than their tables getting bloated, in which case
they would do a manual vacuum and conclude autovacuum wasn't good
enough. Or something.

> Does that conclusion agree with reality? You'd think we'd have heard
> a whole lot of complaints about it, not just Jeremy's; and I don't
> remember it being a sore point. (But then again I just woke up.)
> What seems somewhat more likely is that we broke pgwin32_select
> recently, in which case we oughta find out why. Or else remove it
> entirely (does your patch make that possible?).

AFAIK, it only affects UDP connections, and this patch takes
pgwin32_select out of the loop for all UDP stuff.
But if we get this in, pgwin32_select is only used in the postmaster
accept-new-connections loop (from what I can tell by a quick look), so
I'd definitely want to rewrite that one as well to use a better way than
select-emulation. Then it could go away completely.
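
For readers following along, here is a minimal sketch of the "#undef select" experiment mentioned earlier in the thread. It is not the actual patch and not PostgreSQL code: emulated_select(), open_self_udp_socket() and the two wait_readable_*() helpers are hypothetical names, and the example uses plain POSIX sockets rather than the Windows API. It only illustrates the mechanism: a macro redirects every select() call to a wrapper, and an #undef placed before one call site makes that single call go to the native function again.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical stand-in for an emulation layer such as pgwin32_select().
 * It only counts calls and forwards to the native select(). */
static int emulated_select_calls = 0;

static int
emulated_select(int nfds, fd_set *r, fd_set *w, fd_set *e, struct timeval *t)
{
    emulated_select_calls++;
    return select(nfds, r, w, e, t);    /* macro not defined yet: native call */
}

/* From here on, every select() call is redirected to the wrapper. */
#define select(n, r, w, e, t) emulated_select(n, r, w, e, t)

static int
wait_readable_emulated(int fd, int timeout_ms)
{
    fd_set          rfds;
    struct timeval  tv;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return select(fd + 1, &rfds, NULL, NULL, &tv);  /* goes through wrapper */
}

/* The experiment: drop the redirection right before one call site. */
#undef select

static int
wait_readable_native(int fd, int timeout_ms)
{
    fd_set          rfds;
    struct timeval  tv;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return select(fd + 1, &rfds, NULL, NULL, &tv);  /* native select() */
}

/* Create a UDP socket on the loopback interface that is connected to
 * itself, so a datagram sent on it can be read back from the same fd.
 * Returns the descriptor, or -1 on failure. */
static int
open_self_udp_socket(void)
{
    struct sockaddr_in addr;
    socklen_t          len = sizeof(addr);
    int                fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                  /* let the kernel pick a port */
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *) &addr, &len) < 0 ||
        connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}
```

Sending one byte on the socket returned by open_self_udp_socket() and then calling both wait functions should report the descriptor as readable in each case, with only the first call passing through the wrapper. On a platform where the wrapper misbehaves for UDP, the two calls would be expected to differ, which is the kind of difference the #undef test exposed.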

> Keep in mind also that we have seen the stats-test failure on
> non-Windows machines, so we still need to explain that ...

Yeah. But it *could* be two different stats issues lurking. Perhaps the
issue we've seen on non-windows can be fixed by the settings Alvaro had
me try (increasing autovacuum_vacuum_cost_delay or the delay in the
regression test).

//Magnus
