From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Default setting for enable_hashagg_disk |
Date: | 2020-06-25 23:11:54 |
Message-ID: | 20200625231154.uknvrnqwmahv6rtu@development |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-docs pgsql-hackers |
On Thu, Jun 25, 2020 at 02:28:02PM -0700, Jeff Davis wrote:
>On Thu, 2020-06-25 at 15:56 -0400, Bruce Momjian wrote:
>> It is my understanding that spill of sorts is mostly read
>> sequentially,
>> while hash reads are random. Is that right? Is that not being
>> costed
>> properly?
>
>I don't think there's a major problem with the cost model, but it could
>probably use some tweaking.
>
>Hash writes are random. The hash reads should be mostly sequential (for
>large partitions it will be 128-block extents, or 1MB). The cost model
>assumes 50% sequential and 50% random.
>
The important bit here is that while the logical writes are random,
those are effectively combined in page cache and the physical writes are
pretty sequential. So I think the cost model is fairly reasonable.
Note: Judging by iosnoop stats shared in the thread linked by Jeff.
>Sorts are written sequentially and read randomly, but there's
>prefetching to keep the reads from being too random. The cost model
>assumes 75% sequential and 25% random.
>
>Overall, the IO pattern is better for Sort, but not dramatically so.
>Tomas Vondra did some nice analysis here:
>
>
>https://www.postgresql.org/message-id/20200525021045.dilgcsmgiu4l5jpa@development
>
>That resulted in getting the prealloc and projection patches in.
>
>Regards,
> Jeff Davis
>
>
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2020-06-25 23:18:31 | Re: Default setting for enable_hashagg_disk |
Previous Message | Alvaro Herrera | 2020-06-25 22:58:53 | Re: Default setting for enable_hashagg_disk |
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2020-06-25 23:18:31 | Re: Default setting for enable_hashagg_disk |
Previous Message | Alvaro Herrera | 2020-06-25 22:58:53 | Re: Default setting for enable_hashagg_disk |