From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Joel Jacobson <joel(at)trustly(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_*_columns?
Date: 2015-06-20 15:15:11
Message-ID: CABUevEz6sjx254hPZV_AX=sxodtoeVZHmE4CjuCMunTFgcrv=g@mail.gmail.com
Lists: pgsql-hackers
On Sat, Jun 20, 2015 at 10:55 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> >> On Sat, Jun 20, 2015 at 10:12 AM, Joel Jacobson <joel(at)trustly(dot)com> wrote:
> >> I guess it
> >> primarily depends on how much of the new code that would need to be
> >> rewritten, if the collector is optimized/rewritten in the future?
>
> > I don't think that's really the issue. It's more that I think the
> > extra data would be likely to cause real pain for users.
>
> Yes. The stats data already causes real pain.
>
> > FWIW, I tend to think that the solution here has less to do with
> > splitting the data up into more files and more to do with rethinking
> > the format.
>
> I dunno that tweaking the format would accomplish much. Where I'd love
> to get to is to not have to write the data to disk at all (except at
> shutdown). But that seems to require an adjustable-size shared memory
> block, and I'm not sure how to do that. One idea, if the DSM stuff
> could be used, is to allow the stats collector to allocate multiple
> DSM blocks as needed --- but how well would that work on 32-bit
> machines? I'd be worried about running out of address space.
>
I've considered both that and perhaps using a shared memory message queue
to communicate. Basically, have a backend send a request when it needs a
snapshot of the stats data and get a copy back through that method instead
of from disk. It would be much easier if we didn't actually take a snapshot
of the data per transaction, but we really don't want to give that up (if
we didn't care about that, we could just have a protocol asking for
individual values).
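Roughly what I have in mind on the backend side, as an untested sketch on
top of the existing shm_mq API -- the request struct, the function name,
and the assumption that both queue handles have already been set up in a
DSM segment are all made up for illustration:

#include "postgres.h"

#include "storage/shm_mq.h"

/* Hypothetical request message: "send me a snapshot for this database". */
typedef struct StatsSnapshotRequest
{
	Oid			databaseid;
} StatsSnapshotRequest;

/*
 * Ask the collector for a stats snapshot and return a backend-local copy.
 * "outqh" and "inqh" are assumed to be shm_mq handles already attached to
 * a request queue and a response queue living in a DSM segment.
 */
static void *
request_stats_snapshot(shm_mq_handle *outqh, shm_mq_handle *inqh,
					   Oid databaseid, Size *snaplen)
{
	StatsSnapshotRequest req;
	shm_mq_result res;
	void	   *data;
	void	   *copy;

	req.databaseid = databaseid;

	/* Send the request; block until the collector has picked it up. */
	res = shm_mq_send(outqh, sizeof(req), &req, false);
	if (res != SHM_MQ_SUCCESS)
		elog(ERROR, "could not send stats snapshot request");

	/* Block until the collector streams the snapshot back. */
	res = shm_mq_receive(inqh, snaplen, &data, false);
	if (res != SHM_MQ_SUCCESS)
		elog(ERROR, "could not receive stats snapshot");

	/* "data" points into the queue buffer, so copy it before detaching. */
	copy = palloc(*snaplen);
	memcpy(copy, data, *snaplen);

	return copy;
}

The collector side would be the mirror image: receive the request, build
the snapshot once, and send it back -- which is where the hash table
question below comes in.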
We'd need a way to transfer the whole hash tables over without rebuilding
them on the other end, I think. Just the cost of looping over them to dump
and then rehashing everything on the other end seems quite wasteful and
unnecessary.
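For comparison, the per-entry work I'd like to avoid looks more or less
like this on the sending side (illustrative only: pgStatDBHash and
PgStat_StatDBEntry are the real collector structures, but the nested
per-table and per-function hashes are glossed over, and the receiving
backend would still pay for a hash_search(HASH_ENTER) per entry on top of
this):

#include "postgres.h"

#include "pgstat.h"
#include "utils/hsearch.h"

/*
 * Walk the collector's database hash and flatten each entry into "buf"
 * (assumed large enough).  Every entry is touched once here, and then
 * hashed again when the receiving backend rebuilds its local hash table
 * from the buffer.
 */
static Size
serialize_db_stats(HTAB *dbhash, char *buf)
{
	HASH_SEQ_STATUS hstat;
	PgStat_StatDBEntry *dbentry;
	char	   *ptr = buf;

	hash_seq_init(&hstat, dbhash);
	while ((dbentry = (PgStat_StatDBEntry *) hash_seq_search(&hstat)) != NULL)
	{
		memcpy(ptr, dbentry, sizeof(PgStat_StatDBEntry));
		ptr += sizeof(PgStat_StatDBEntry);
	}

	return ptr - buf;
}

What I'd prefer is a representation we can hand over (or map) wholesale, so
neither side has to pay for that loop.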
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/