Re: pb with big volumes

From: Marc Millas <marc(dot)millas(at)mokadb(dot)com>
To: Ron <ronljohnsonjr(at)gmail(dot)com>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: pb with big volumes
Date: 2023-08-11 10:17:26
Message-ID: CADX_1aZZ0W-UFkZxeOmKt0R4u7SYL2n8gNQeQv1ULg18zrkHPg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

No if there is a pumping effect ie. Loading data in the cache flushed by
the other request and again and again.
I have had this on an Oracle db years ago. Iostat 100% for days.
Managed by partitionning and locking some data in the cache

Le ven. 11 août 2023 à 03:54, Ron <ronljohnsonjr(at)gmail(dot)com> a écrit :

> Wouldn't IO contention make for additive timings instead of exponential?
>
> On 8/10/23 20:41, Adam Scott wrote:
>
> I think your concern is that 20 min + 30 min does not equal 3 hours.
>
> It might be natural to think the contention would, at max, be 50 min x 2
> (1 hr 40 min).
>
> So what's going on?
>
> It seems disk I/O is a primary suspect since you hint for an iostat
> replacement inside of Postgres.
>
> If it is due to disk I/O the resolution will be to add RAID 0 SSDs at best.
>
> Consider looking at io stats on the container's persistent volumes.
>
> What is the pipe connecting the database server to the disks? If it's
> NAS, well that would explain it.
>
> HTH,
> Adam
>
>
>
> On Thu, Aug 10, 2023 at 2:37 PM Marc Millas <marc(dot)millas(at)mokadb(dot)com>
> wrote:
>
>> Hi,
>>
>> I have a 15 TB db on postgres 14 (soon 15).
>> shared buffers is 32 GB.
>>
>> It's a db with max 15 users and often less, and currently 1 or 2.
>> the biggest table have 133 partitions of 150M to 200M+ rows each.
>> lots of request access explicitly one of those.
>>
>> When I, alone, run a query "reading" 15M buffers, it takes 20 minutes
>> (+-5minutes). inside the query there are 2 group by on a 200M rows
>> partition, with all the rows in each group by.
>> When a colleague run the same kind of request (not the same request, but
>> something reading roughly the same volume ) , on a different set of data,
>> his request is completed in less than half an hour.
>> If we run our requests simultaneously... my request take hours. around 3
>> hours.
>>
>> I am making a supposition that its some kind of "pumping" effect in the
>> cache.
>>
>> I cannot have access to the underlying OS. I can, for sure, do some copy
>> xx from program 'some command', but its a container with very limited
>> possibilities, not even 'ps'.
>> So I would like to monitor from inside the db (so without iostat and the
>> same) the volumes of read that postgres do to the OS.
>> I did activate track_io_timing, but the volumes I get in the explain
>> analyze buffer are roughly the same alone or not alone. (the 15M buffers
>> told )
>> to my understanding, the volumes that are shown in pg_stat_database are
>> the useful ones ie. even if the db as to read it from disk more than once.
>> true ? or false ?
>>
>> So.. either my supposition is not correct, and I will read with a lot of
>> interest other ideas
>> either its correct and I would like to know how to monitor this (in the
>> current context, installing a dedicated extension is not impossible, but is
>> a very boring process)
>>
>> Thanks for your help :-)
>>
>> regards,
>>
>> PS: I know that providing the complete data model and the exact requests
>> can be considered mandatory, but when I change the request I get the very
>> same behaviour...
>>
>>
>>
>> Marc MILLAS
>> Senior Architect
>> +33607850334
>> www.mokadb.com
>>
>>
> --
> Born in Arizona, moved to Babylonia.
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Ron 2023-08-11 14:27:11 Re: pb with big volumes
Previous Message David Rowley 2023-08-11 04:40:52 Re: pb with big volumes