From: Jeremy Schneider <schneider(at)ardentperf(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: typical active table count?
Date: 2023-06-27 22:00:35
Message-ID: 03a01fb1-e0be-24f2-eadb-8e1a4b4981bf@ardentperf.com
Lists: pgsql-general
On 6/27/23 12:08 PM, Ron wrote:
> On 6/27/23 13:47, Jeremy Schneider wrote:
>> On 6/27/23 9:32 AM, Ben Chobot wrote:
>>> We certainly have databases where far more than 100 tables are updated
>>> within a 10 second period. Is there a specific concern you have?
>>>
>> Thanks Ben, not a concern but I'm trying to better understand how common
>> this might be. And I think sharing general statistics about how people
>> use PostgreSQL is a great help to the developers who build and maintain it.
>>
>> One really nice thing about PostgreSQL is that two quick copies of
>> pg_stat_all_tables and you can easily see this sort of info.
>>
>> If you have a database where more than 100 tables are updated within a
>> 10 second period - this seems really uncommon to me - I'm very curious
>> about the workload.
>
> 100 tables updated just means /possibly complicated schema/, not
> necessarily high volume.
>
> ...
>
> And honestly, 100 tables in 10 seconds is 10 tables/second. If each
> gets one insert, that's a laughably slow transaction rate. (Unless of
> course there's 85 indices per table, and foreign keys don't have
> supporting indices.)
I don't think the math actually works this way on highly concurrent
systems. In fact, this morning I connected with a coworker who works on
Amazon fulfillment center DBs and there was almost no difference in the
number of tables with insert/update/delete regardless of whether you
looked at a 10 second window or a 2 second window. I was also able to
chat with another coworker at Amazon who got numbers from a couple of
their PG databases, connected with one person on Slack at a different
company who passed along numbers, and got a few emails from Oracle folks.
The numbers reported back to me ranged from 29 to over a hundred active tables.
Obviously there are also lots of small databases behind WordPress
websites with much less activity, but I found this to be an interesting
measure of some respectably busy systems.
The original context was a conversation related to logical replication
of DB changes.
But then I got interested in the general question and topic, and
someone on the Oracle side mentioned system tables, which is a really
good point that hadn't occurred to me yet. The original conversation was
concerned with user tables and not system ones, but there would be a
fair amount of ongoing system table activity too.
Besides partitioning, another interesting dimension of the conversation
has been thinking about different categories of workloads. For example:
SaaS or multitenant applications with many copies of a similar schema,
ISVs, ERPs, or large enterprise databases with lots of development
history. All of these categories can easily ramp up the counts.
I'm still interested in more data - if anyone reading this can grab a
couple of snapshots of pg_stat_all_tables and report back numbers for a 10
second window and a 2 second window, that would be amazing!
-Jeremy