From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Magnus Hagander <magnus(at)hagander(dot)net> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joel Jacobson <joel(at)trustly(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: pg_stat_*_columns? |
Date: | 2015-06-23 01:01:07 |
Message-ID: | CA+TgmoZTeiGLSK8XHjCEazNX9CCD=WHwsA+LgngDW-O+x3fwew@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, Jun 21, 2015 at 11:43 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Sat, Jun 20, 2015 at 11:55 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Sat, Jun 20, 2015 at 7:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> >> But if the structure
>> >> got too big to map (on a 32-bit system), then you'd be sort of hosed,
>> >> because there's no way to attach just part of it. That might not be
>> >> worth worrying about, but it depends on how big it's likely to get - a
>> >> 32-bit system is very likely to choke on a 1GB mapping, and maybe even
>> >> on a much smaller one.
>> >
>> > Yeah, I'm quite worried about assuming that we can map a data structure
>> > that might be of very significant size into shared memory on 32-bit
>> > machines. The address space just isn't there.
>>
>> Considering the advantages of avoiding message queues, I think we
>> should think a little bit harder about whether we can't find some way
>> to skin this cat. As I think about this a little more, I'm not sure
>> there's really a problem with one stats DSM per database. Sure, the
>> system might have 100,000 databases in some crazy pathological case,
>> but the maximum number of those that can be in use is bounded by
>> max_connections, which means the maximum number of stats file DSMs we
>> could ever need at one time is also bounded by max_connections. There
>> are a few corner cases to think about, like if the user writes a
>> client that connects to all 100,000 databases in very quick
>> succession, we've got to jettison the old DSMs fast enough to make
>> room for the new DSMs before we run out of slots, but that doesn't
>> seem like a particularly tough nut to crack. If the stats collector
>> ensures that it never attaches to more than MaxBackends stats DSMs at
>> a time, and each backend ensures that it never attaches to more than
>> one stats DSM at a time, then 2 * MaxBackends stats DSMs is always
>> enough. And that's just a matter of bumping
>> PG_DYNSHMEM_SLOTS_PER_BACKEND from 2 to 4.
>>
>> In more realistic cases, it will probably be normal for many or all
>> backends to be connected to the same database, and the number of stats
>> DSMs required will be far smaller.
>
> What about a combination along the lines of something like this: the stats
> collector keeps the statistics in local memory as before. But when a backend
> needs to get a snapshot of its data, it uses a shared memory queue to request
> it. What the stats collector does in this case is allocate a new DSM, copy
> the data into that DSM, and hand the DSM over to the backend. At this point
> the stats collector can forget about it, and it's up to the backend to get
> rid of it when it's done with it.
Well, there seems to be little point in having the stats collector
forget about a DSM that it could equally well have shared with the
next guy who wants a stats snapshot for the same database. That case
is surely *plenty* common enough to be worth optimizing for.
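
A rough sketch of the sharing idea, as a toy Python model rather than
PostgreSQL code. Every name here (Snapshot, SnapshotRegistry, acquire,
release) is hypothetical, and a plain dict stands in for a stats DSM. It
assumes the collector tracks live snapshots per database with a reference
count, so a second request for the same database reuses the existing copy
instead of allocating another one:

```python
# Illustrative model only: none of these names exist in PostgreSQL.
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """Stands in for one stats DSM holding a copy of a database's stats."""
    database: str
    data: dict
    refcount: int = 0


@dataclass
class SnapshotRegistry:
    """Collector-side table of live snapshots, keyed by database name."""
    live: dict = field(default_factory=dict)
    created: int = 0  # number of snapshots ("DSMs") ever allocated

    def acquire(self, database, current_stats):
        # Reuse the live snapshot if some backend still holds one;
        # allocate a new copy only when none exists for this database.
        snap = self.live.get(database)
        if snap is None:
            snap = Snapshot(database, dict(current_stats))
            self.live[database] = snap
            self.created += 1
        snap.refcount += 1
        return snap

    def release(self, snap):
        # The last holder to let go removes the snapshot, freeing its slot.
        snap.refcount -= 1
        if snap.refcount == 0:
            del self.live[snap.database]


registry = SnapshotRegistry()
first = registry.acquire("app", {"seq_scan": 10})
second = registry.acquire("app", {"seq_scan": 11})  # shares the first copy
registry.release(first)
registry.release(second)
```

Note that the second caller sees the data captured for the first one, so a
real design would presumably need some rule for when a shared snapshot is too
stale to hand out again; this sketch does not attempt one.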
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company