From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Magnus Hagander <magnus(at)hagander(dot)net> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Joel Jacobson <joel(at)trustly(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: pg_stat_*_columns? |
Date: | 2015-06-20 15:32:55 |
Message-ID: | 45012.1434814375@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Magnus Hagander <magnus(at)hagander(dot)net> writes:
> On Sat, Jun 20, 2015 at 10:55 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I dunno that tweaking the format would accomplish much. Where I'd love
>> to get to is to not have to write the data to disk at all (except at
>> shutdown). But that seems to require an adjustable-size shared memory
>> block, and I'm not sure how to do that. One idea, if the DSM stuff
>> could be used, is to allow the stats collector to allocate multiple
>> DSM blocks as needed --- but how well would that work on 32-bit
>> machines? I'd be worried about running out of address space.
> I've considered both that and to perhaps use a shared memory message queue
> to communicate. Basically, have a backend send a request when it needs a
> snapshot of the stats data and get a copy back through that method instead
> of disk. It would be much easier if we didn't actually take a snapshot of
> the data per transaction, but we really don't want to give that up (if we
> didn't care about that, we could just have a protocol asking for individual
> values).
Yeah, that might work quite nicely, and it would not require nearly as
much surgery on the existing code as mapping the stuff into
constrained-size shmem blocks would do. The point about needing a data
snapshot is a good one as well; I'm not sure how we'd preserve that
behavior if backends are accessing the collector's data structures
directly through shmem.
I wonder if we should think about replacing the IP-socket-based data
transmission protocol with a shared memory queue, as well.
> We'd need a way to actually transfer the whole hashtables over, without
> rebuilding them on the other end I think. Just the cost of looping over it
> to dump and then rehashing everything on the other end seems quite wasteful
> and unnecessary.
Meh. All of a sudden you've made it complicated and invasive again,
to get rid of a bottleneck that's not been shown to be a problem.
Let's do the simple thing first, else maybe nothing will happen at all.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2015-06-20 15:36:11 | Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H |
Previous Message | Feng Tian | 2015-06-20 15:29:33 | Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H |