Quick Links

Re: Disk wait problem...

From:	Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To:	pf(at)pfortin(dot)com
Cc:	pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Disk wait problem...
Date:	2023-10-23 23:31:30
Message-ID:	9938ff1a-8837-454b-a50c-a748e84c172a@aklaver.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On 10/23/23 14:55, pf(at)pfortin(dot)com wrote:
Please reply to list also.
Ccing the list for this post.

> On Mon, 23 Oct 2023 12:44:56 -0700 Adrian Klaver wrote:
>
>> On 10/23/23 11:54, pf(at)pfortin(dot)com wrote:
>>> Hi,
>>>
>>> I have a 1.6TB database with over 330 tables on a 4TB NVMe SSD. All
>>> tables are static (no updates); most in 8M and 33M row sizes. Queries have
>>> been great, until recently.
>>
>>> Also attached is the relevant system journal entries for one query that
>>> took 2 seconds and two more that took 62 and 52 seconds... Unlike the
>>> above, these had 3 processes in disk-wait -- see attached screenshot...
>>
>> What happens if you select the tables in the command line client psql?
>
> Hi Adrian,
>
> Sorry about everything inline; I think it saves time vs dealing with
> attachments...
>
> Summary:
> * single large partition dedicated to PG
> * only PG seems affected
> * buffered reads are 6x slower when PG queries are in progress
> * psql queries of 30 33M tables shows random results up to 4m51s
> * successful psql queries below 500ms
> * ls -l loop on another directory had no delays.
>
> HTH...
>
> The WB query was in disk-wait, so psql gave:
> ncsbe=# select count(*) from ncvhis_2022_10_29;
> FATAL: terminating connection due to administrator command
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
>
> The above query finished a second or two before the WB results
> appeared; then this next query succeeded right after:
> ncsbe=# select count(*) from
> ncvhis_2022_10_29; count
> ----------
> 33861176
> (1 row)
>
> Another table went into disk-wait long enough that I was able to query
> the tmp directory on the same disk just fine.
>
> Querying the same table; no error, canceled it:
> ncsbe=# select count(*) from ncvhis_2022_07_23;
> ^CCancel request sent
> ERROR: canceling statement due to user request
>
> While this table was still in disk-wait, tried another table which
> returned quickly:
> ncsbe=# select count(*) from ncvhis_2023_10_21;
> count
> ----------
> 32983343
> (1 row)
>
> Eventually, the WB query did complete...
>
> Next, I made a query list for 30 tables, turned on timing and pasted the
> queries. Pasting results inline:
> ncsbe=# select count(*) from ncvhis_2012_12_06;
> select count(*) from ncvhis_2016_12_03;
> select count(*) from ncvhis_2020_08_01;
> select count(*) from ncvhis_2020_10_31;
> select count(*) from ncvhis_2020_12_05;
> select count(*) from ncvhis_2020_12_26;
> select count(*) from ncvhis_2021_03_06;
> select count(*) from ncvhis_2021_06_12;
> select count(*) from ncvhis_2022_07_23;
> select count(*) from ncvhis_2022_10_15;
> select count(*) from ncvhis_2022_10_22;
> select count(*) from ncvhis_2022_10_29;
> select count(*) from ncvhis_2022_11_05;
> select count(*) from ncvhis_2022_11_12;
> select count(*) from ncvhis_2022_11_19;
> select count(*) from ncvhis_2022_11_26;
> select count(*) from ncvhis_2022_12_03;
> select count(*) from ncvhis_2022_12_10;
> select count(*) from ncvhis_2022_12_17;
> select count(*) from ncvhis_2022_12_24;
> select count(*) from ncvhis_2022_12_31;
> select count(*) from ncvhis_2023_01_08;
> select count(*) from ncvhis_2023_01_14;
> select count(*) from ncvhis_2023_01_21;
> select count(*) from ncvhis_2023_01_28;
> select count(*) from ncvhis_2023_02_04;
> select count(*) from ncvhis_2023_02_11;
> select count(*) from ncvhis_2023_02_18;
> select count(*) from ncvhis_2023_02_27;
> select count(*) from ncvhis_2023_03_04;
> count
> ----------
> 31923950
> (1 row)
>
> Time: 72404.786 ms (01:12.405)
> count
> ----------
> 29701168
> (1 row)
>
> Time: 301.246 ms
> count
> ----------
> 32172845
> (1 row)
>
> Time: 409.974 ms
> count
> ----------
> 32162748
> (1 row)
>
> Time: 363.836 ms
> count
> ----------
> 34679651
> (1 row)
>
> Time: 351.167 ms
> count
> ----------
> 35006711
> (1 row)
>
> Time: 348.378 ms
> count
> ----------
> 35003995
> (1 row)
>
> Time: 348.712 ms
> count
> ----------
> 34994502
> (1 row)
>
> Time: 351.901 ms
> count
> ----------
> 33686292
> (1 row)
>
> Time: 487.837 ms
> count
> ----------
> 33861658
> (1 row)
>
> Time: 40987.826 ms (00:40.988)
> count
> ----------
> 33861381
> (1 row)
>
> Time: 76964.281 ms (01:16.964)
> count
> ----------
> 33861176
> (1 row)
>
> Time: 483.329 ms
> count
> ----------
> 33861071
> (1 row)
>
> Time: 18919.267 ms (00:18.919)
> count
> ----------
> 29344354
> (1 row)
>
> Time: 50896.978 ms (00:50.897)
> count
> ----------
> 30709227
> (1 row)
>
> Time: 25784.000 ms (00:25.784)
> count
> ----------
> 32368001
> (1 row)
>
> Time: 45407.599 ms (00:45.408)
> count
> ----------
> 33126826
> (1 row)
>
> Time: 75171.609 ms (01:15.172)
> count
> ----------
> 33126808
> (1 row)
>
> Time: 88871.004 ms (01:28.871)
> count
> ----------
> 33127317
> (1 row)
>
> Time: 128565.127 ms (02:08.565)
> count
> ----------
> 33127192
> (1 row)
>
> Time: 210248.222 ms (03:30.248)
> count
> ----------
> 33127117
> (1 row)
>
> Time: 246609.561 ms (04:06.610) 4 minutes! ;p
> count
> ----------
> 33124457
> (1 row)
>
> Time: 166406.283 ms (02:46.406)
> count
> ----------
> 33123714
> (1 row)
>
> Time: 291483.538 ms (04:51.484) nearly 5 minutes ;p ;p
> count
> ----------
> 33123313
> (1 row)
>
> Time: 2269.961 ms (00:02.270)
> count
> ----------
> 33123197
> (1 row)
>
> Time: 413.219 ms
> count
> ----------
> 33123091
> (1 row)
>
> Time: 433.817 ms
> count
> ----------
> 33122953
> (1 row)
>
> Time: 395.844 ms
> count
> ----------
> 33122752
> (1 row)
>
> Time: 7251.637 ms (00:07.252)
> count
> ----------
> 33122665
> (1 row)
>
> Time: 384.328 ms
> count
> ----------
> 33122573
> (1 row)
>
> Time: 384.887 ms
> ncsbe=#
>
> I don't see a pattern in the above; matches the randomness I saw using
> WB...
>
> Before the above finished, I issued this command on another konsole...
>
> $ while true; do ls -l > /tmp/ll; date; done
> Mon Oct 23 05:12:54 PM EDT 2023
> Mon Oct 23 05:12:54 PM EDT 2023
> Mon Oct 23 05:12:54 PM EDT 2023
> Mon Oct 23 05:12:54 PM EDT 2023
> Mon Oct 23 05:12:54 PM EDT 2023
> Mon Oct 23 05:12:54 PM EDT 2023
> Mon Oct 23 05:12:54 PM EDT 2023
> Mon Oct 23 05:12:54 PM EDT 2023
> Mon Oct 23 05:12:54 PM EDT 2023
> Mon Oct 23 05:12:54 PM EDT 2023
> Mon Oct 23 05:12:54 PM EDT 2023
> Mon Oct 23 05:12:54 PM EDT 2023
> ...
> Lots of high speed output with no pauses, at 64 loops per second.
>
> So that means only PG is getting disk waits.
>
> This disk is dedicated to PG and a rarely used semi-permanent tmp
> directory on a single partition.
>
> This is from fdisk -l:
> Disk /dev/nvme1n1: 3.64 TiB, 4000787030016 bytes, 7814037168 sectors
> Disk model: Seagate FireCuda 530 ZP4000GM30023
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disklabel type: gpt
> Disk identifier: 9B3C5D73-6FFD-475D-9E1E-26F43B41C690
>
> Device Start End Sectors Size Type
> /dev/nvme1n1p1 2048 7814035455 7814033408 3.6T Linux filesystem
>
> $ hdparm -Tt /dev/nvme1n1
> /dev/nvme1n1:
> Timing cached reads: 36820 MB in 2.00 seconds = 18434.72 MB/sec
> Timing buffered disk reads: 774 MB in 3.00 seconds = 257.74 MB/sec
>
> The 30 queries still hadn't finished at this point...
>
> Interesting: the hdparm buffered read is very slow; once the queries
> finished, the disk performance is back:
> $ hdparm -Tt /dev/nvme1n1
>
> /dev/nvme1n1:
> Timing cached reads: 37382 MB in 2.00 seconds = 18715.85 MB/sec
> Timing buffered disk reads: 5002 MB in 3.17 seconds = 1575.89 MB/sec
>
> Over 6x faster...
>
> So far, the only disk waits I see are PG related. Do have any opinion on
> Ken's comment about parallel queries?
>
> Thanks,
> Pierre
>
>
>>>
>>> I just stopped PG, updated the full backup which was done overnight (so
>>> the update took about a second), restarted PG and the problem persists...
>>>
>>> Any suggestions where to look next?
>>>
>>> Thanks,
>>> Pierre
>>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Re: Disk wait problem... at 2023-10-23 19:44:56 from Adrian Klaver

Responses

Re: Disk wait problem... at 2023-10-24 00:03:16 from pf
Re: Disk wait problem... at 2023-10-26 09:50:00 from Peter J. Holzer

Browse pgsql-general by date

	From	Date	Subject
Next Message	pf	2023-10-24 00:03:16	Re: Disk wait problem...
Previous Message	Brad White	2023-10-23 23:16:56	setting up streaming replication