Re: Auto vacuum not running -- Could not bind socket for statistics collector

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tim Schäfer <ts+ml(at)rcmd(dot)org>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Auto vacuum not running -- Could not bind socket for statistics collector
Date: 2014-12-03 16:04:04
Message-ID: 16641.1417622644@sss.pgh.pa.us
Lists: pgsql-general

Tim Schäfer <ts+ml(at)rcmd(dot)org> writes:
>> On December 2, 2014 at 4:51 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Yes, this will break autovacuum, because it won't have any way to find out
>> what it should vacuum. The cause probably is a DNS issue: "localhost"
>> isn't resolving to anything sensible. "dig localhost" on the command line
>> might offer some insight.

> thanks for your answer. Here is my full 'dig localhost' from the database
> server:
> ...
> Looks fine to me. Or is there something wrong with it?

Hmph, looks fine to me too.

> And are you sure pgsql is unhappy with localhost? It would be great if I
> definitely knew the address it is trying to bind. Is there a way to tell?

As I mentioned, this has nothing to do with listen_addresses or the port
setting; the code in pgstat.c is hard-wired to bind to whatever
"localhost" resolves as.
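
If strace is not handy, a rough way to approximate that step from outside
the server is to resolve "localhost" through getaddrinfo() and try a UDP
bind on each address it returns. This is only a sketch of the idea, not
the pgstat.c code itself; the function name probe_localhost_bind is made
up for illustration, and it should be run as the same OS user that runs
the postmaster.

```python
import socket


def probe_localhost_bind(host="localhost"):
    """Resolve host via getaddrinfo() and try a UDP bind on each result.

    Hypothetical helper for illustration; it approximates, but is not,
    what the statistics collector does at startup.
    """
    results = []
    # AF_UNSPEC asks for both IPv4 and IPv6 candidates; port None means
    # "any port", so each returned sockaddr carries port 0.
    candidates = socket.getaddrinfo(host, None, socket.AF_UNSPEC,
                                    socket.SOCK_DGRAM)
    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError as exc:
            results.append((sockaddr[0], "socket() failed: %s" % exc))
            continue
        try:
            sock.bind(sockaddr)
            results.append((sockaddr[0], "ok"))
        except OSError as exc:
            results.append((sockaddr[0], "bind() failed: %s" % exc))
        finally:
            sock.close()
    return results


if __name__ == "__main__":
    for address, outcome in probe_localhost_bind():
        print(address, outcome)
```

Any address that reports a bind() failure here is a good candidate for
what the postmaster is tripping over. If getaddrinfo() itself raises an
error, then "localhost" is not resolving at all.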

One idea is to see if you can strace postmaster startup (or whatever your
preferred local equivalent of strace is, perhaps truss). That will
produce a great deal of output but there should only be a few bind()
calls so it won't be too hard to find the section where this is happening.

Hmm ... actually, when I try it here (on a RHEL6 machine) the relevant
stretch of output is

open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 8
fcntl(8, F_GETFD) = 0x1 (flags FD_CLOEXEC)
fstat(8, {st_mode=S_IFREG|0644, st_size=158, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa42206a000
read(8, "127.0.0.1 localhost localhost."..., 4096) = 158
read(8, "", 4096) = 0
close(8) = 0
munmap(0x7fa42206a000, 4096) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 8
connect(8, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
getsockname(8, {sa_family=AF_INET, sin_port=htons(37185), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
close(8) = 0
socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 8
connect(8, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
getsockname(8, {sa_family=AF_INET6, sin6_port=htons(39774), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
close(8) = 0
socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 8
bind(8, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
getsockname(8, {sa_family=AF_INET6, sin6_port=htons(46928), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
connect(8, {sa_family=AF_INET6, sin6_port=htons(46928), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
sendto(8, "\307", 1, 0, NULL, 0) = 1
select(9, [8], NULL, NULL, {0, 500000}) = 1 (in [8], left {0, 499994})
recvfrom(8, "\307", 1, 0, NULL, NULL) = 1
fcntl(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0

which suggests that getaddrinfo() preferentially looks in /etc/hosts
before contacting any DNS server. "dig" queries DNS servers directly
and never reads /etc/hosts, so perhaps that "dig" call is not telling
you the real state of affairs, and what you need to do is see if
there's a bogus entry for localhost in /etc/hosts.
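
For checking /etc/hosts, something along these lines would list every
address the file maps to "localhost". It is a minimal sketch that assumes
the usual "address name [aliases...]" layout with "#" comments; the
function name addresses_for is made up for illustration.

```python
def addresses_for(name, hosts_text):
    """Return the addresses that hosts-file text maps to the given name.

    Hypothetical helper for illustration. Host names are compared
    case-insensitively, and anything after '#' is treated as a comment.
    """
    wanted = name.lower()
    found = []
    for raw_line in hosts_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if wanted in (entry.lower() for entry in names):
            found.append(address)
    return found


if __name__ == "__main__":
    with open("/etc/hosts") as handle:
        print(addresses_for("localhost", handle.read()))
```

On a sane machine I'd expect only loopback addresses such as 127.0.0.1
and ::1 to show up; anything else in the list deserves a closer look.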

regards, tom lane
