Re: Increased iowait and blk_read_time with higher shared_buffers

From: Jordan Hurwich <jhurwich(at)pulsasensors(dot)com>
To: Samed YILDIRIM <samed(at)reddoc(dot)net>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org, Gautam Bellary <gautam(at)pulsasensors(dot)com>
Subject: Re: Increased iowait and blk_read_time with higher shared_buffers
Date: 2022-12-14 18:43:45
Message-ID: CAJKqsjsesEzBwZfqo4K_QGDhQ0Bt429TO=uUzWjyQbHAJYrmHg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Awesome, this is really helpful Samed. I'll start experimenting with these
settings next. Really appreciate your guidance.

On Wed, Dec 14, 2022 at 10:41 AM Samed YILDIRIM <samed(at)reddoc(dot)net> wrote:

> Hello Jordan,
>
> You don't have to set %25 for the best performance. You need to test
> different values for your database. If I were you, I would
>
> - try to enable huge pages. You probably will see better performance
> with bigger shared_buffers when you configure huge pages. ->
> https://www.postgresql.org/docs/14/kernel-resources.html#LINUX-HUGE-PAGES
> - set effective_io_concurrency to 200. But, you need to test to figure
> out the best value. It significantly depends on your disk's
> metrics/configuration
> - set random_page_cost to 2 and try to decrease it gradually until 1.2.
> - set effective_cache_size to 24GB
> - run pg_test_timing on the server to see the cost of asking time to
> the system. Because track_io_timing is enabled in your configuration file.
> If it is expensive, I would disable tracking io timing.
>
>
> Note that I assumed that those resources/servers are reserved for
> PostgreSQL and there is no other service running on them.
>
> Best regards.
> Samed YILDIRIM
>
>
> On Wed, 14 Dec 2022 at 20:12, Jordan Hurwich <jhurwich(at)pulsasensors(dot)com>
> wrote:
>
>> Thanks for your thoughtful response Samed.
>>
>> I'm familiar with the article you linked to, and part of my surprise is
>> that with these 32GB RAM machines we're seeing better performance at 12.5%
>> (4GB) than the commonly recommended 25% (8GB) of system memory for
>> shared_buffers. Your notes about disk read stats from Postgres potentially
>> actually representing blocks read from the OS cache make sense, I just
>> imagined that Postgres would be better at managing the memory when it was
>> dedicated to it via shared_buffers than the OS (obviously with some point
>> of diminishing returns); and I'm still hoping there's some Postgres
>> configuration change we can make that enables better performance through
>> improved utilization of shared_buffers at the commonly recommended 25% of
>> system memory.
>>
>> You mentioned effective_cache_size, which we currently have set to 16GB
>> (50% of system memory). Is it worth us experimenting with that value, if so
>> would you recommend we try reducing it or increasing it? Are there other
>> settings that we might consider to see if we can improve the utilization of
>> shared_buffers at higher values like 8GB (25% of system memory)?
>>
>> On Wed, Dec 14, 2022 at 4:38 AM Samed YILDIRIM <samed(at)reddoc(dot)net> wrote:
>>
>>> Hi Jordan,
>>>
>>> Increased shared buffer size does not necessarily mean an increased
>>> performance.
>>>
>>> Regarding the negative correlation between IOWait and shared_buffers'
>>> size; if you don't increase memory of the system, it is an expected result
>>> in my opinion. Because, PostgreSQL starts reserving a bigger portion of the
>>> system memory, and the OS cache size decreases respectively. Smaller OS
>>> cache can easily result with more disk access and higher IO demand and
>>> bigger IOWait.
>>>
>>> As you can see in graphs, when you increase the size of shared_buffers,
>>> you see higher block hits and lower block reads. "hits" refers to the
>>> blocks that are already in shared_buffers. "reads" refers to the blocks
>>> that are not in shared_buffers and *"read from* *disk"*. But, *"read
>>> from disk"* that you see in PostgreSQL's statistic catalogs doesn't
>>> mean all of those blocks were read from the disk. PostgreSQL requests data
>>> blocks, which are not already in shared_buffers, from the kernel. And, if
>>> the requested block is in the OS cache, the kernel provides it directly
>>> from the memory. No disk access, therefore, happens at all. And, you
>>> observe that through lower disk access (I/O) and lower IOWait on your
>>> operating system.
>>>
>>> When you increase size of shared_buffers without increasing amount of
>>> the system memory and with or without decreasing effective_cache_size,
>>> PostgreSQL considers the possibility of the block to be requested on the
>>> memory lower than previous configuration. So, it creates execution plans
>>> with less index usages. Less index usage means more sequential scan. More
>>> sequential scan means more disk read. We already have less OS cache. And
>>> the system has to carry out more disk accesses.
>>>
>>> As you can see, they are all connected. Setting shared_buffers higher
>>> than a threshold, which varies from database to database, actually
>>> decreases your performance.
>>>
>>> To conclude, your results are expected results.
>>>
>>> A useful resource to read:
>>> https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
>>>
>>>> ..... given the way PostgreSQL also relies on the operating system
>>>> cache, it's unlikely you'll find using more than 40% of RAM to work better
>>>> than a smaller amount.
>>>>
>>>
>>> Best regards.
>>> Samed YILDIRIM
>>>
>>>
>>> On Tue, 13 Dec 2022 at 02:29, Jordan Hurwich <jhurwich(at)pulsasensors(dot)com>
>>> wrote:
>>>
>>>> Hi everyone,
>>>> I'm writing to ask about a correlation I was surprised to observe on
>>>> our PSQL machines (particularly read-only standbys) where increasing
>>>> "shared_buffers" appears to result in
>>>> increased pg_stat_database.blk_read_time and CPU iowait, which in turn
>>>> seems to correlate with reduced throughput for our query-heavy services -
>>>> details below.
>>>>
>>>> Is this expected, or are there configuration changes we might make to
>>>> improve the performance at higher "shared_buffers" values?
>>>>
>>>> Thanks, let me know if I can provide any more info,
>>>> Jordan
>>>>
>>>> - Tests and results - public Datadog dashboard here
>>>> <https://p.datadoghq.com/sb/0d34b3451-8bde042f82c012981b94796cdc26e259>,
>>>> screenshot attached:
>>>> - Our beta system ("endor") was run with three different
>>>> configurations over the ~30hrs from Dec 11 17:00 to Dec 13 0:00 (UTC)
>>>> - The only changes between these deployments was the
>>>> "shared_buffers" parameter for all PSQL instances (machine and
>>>> configuration details below).
>>>> - "shared_buffers" = "4000MB" - from Dec 10 19:00 to Dec 11
>>>> 20:00 UTC
>>>> - "shared_buffers" = "8000MB" - from Dec 11 21:00 to Dec 12
>>>> 13:30 UTC
>>>> - "shared_buffers" = "14000MB" - from Dec 12, 14:30 to Dec
>>>> 13, 0:00 UTC
>>>> - The datadog dashboard
>>>> <https://p.datadoghq.com/sb/0d34b3451-8bde042f82c012981b94796cdc26e259>
>>>> shows our results including cpu divided by usage and the cache hit vs disk
>>>> read ratio including blk_read_time (additional metrics were enabled at
>>>> about Dec 11, 10am PST)
>>>> - Our most query heavy service is our "Trends worker" for
>>>> which the average worker duration is shown in the top-left graph
>>>> - We expect the workload to be relatively constant
>>>> throughout this period, particularly focusing on the standby instances
>>>> (PQSL2 and PSQL3) where all read-only queries should be sent.
>>>> - We see the lowest duration, i.e. best performance, most
>>>> consistently with the lowest setting for shared_buffers, "4000MB"
>>>> - As we increase shared_buffers we see increased iowait on
>>>> the standby instances (PSQL2 and PSQL3) and increased blk_read_time
>>>> (per pg_stat_database), in the bottom-most graphs as "blks_read_time".
>>>> - Even though we also see a higher ratio of cache hits on
>>>> those instances. Our graphs show the per second change
>>>> in pg_stat_database.blks_read abd blks_hit (as "all_hit/s" and
>>>> "all_read/s") and pg_statio_user_tables.heap_blks_read, heap_blks_hit,
>>>> idx_blks_read, and idx_blks_hit
>>>> - Cluster contains 3 PSQL nodes, all on AWS EC2 instances,
>>>> postgresql.conf attached
>>>> - Version: psql (PostgreSQL) 14.1
>>>> - Machine:
>>>> - AWS "c6gd.4xlarge" (32GB RAM, 16 core 2.5 GHz, local
>>>> storage 950 GB ssd)
>>>> - uname -a: Linux ip-172-30-64-110 5.4.0-1038-aws #40-Ubuntu
>>>> SMP Fri Feb 5 23:53:34 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
>>>> - Replication via WAL:
>>>> - Line configuration: PSQL1 (master), PSQL1 followed by
>>>> PSQL2, PSQL2 followed by PSQL3
>>>> - Managed by repmgr (version: repmgr 5.3.0), no failovers
>>>> observed during timeframe of interest
>>>> - Load balancing:
>>>> - Managed by PGPool-II (version: 4.3.2 (tamahomeboshi)) on 3
>>>> AWS instances
>>>> - All write queries go to master. All read-only queries go to
>>>> standbys unless WAL on standby > 10MB, falling back to read from master as
>>>> last resort
>>>>
>>>>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Rick Otten 2022-12-14 20:03:42 creating hash indexes
Previous Message Samed YILDIRIM 2022-12-14 18:41:27 Re: Increased iowait and blk_read_time with higher shared_buffers