Quick Links

Re: Default setting for enable_hashagg_disk

From:	Jeff Davis <pgsql(at)j-davis(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Default setting for enable_hashagg_disk
Date:	2020-06-25 21:28:02
Message-ID:	e43828bbf4c1dc81ce8a38e9b769fd7bda45d5b4.camel@j-davis.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-docs pgsql-hackers

On Thu, 2020-06-25 at 15:56 -0400, Bruce Momjian wrote:
> It is my understanding that spill of sorts is mostly read
> sequentially,
> while hash reads are random. Is that right? Is that not being
> costed
> properly?

I don't think there's a major problem with the cost model, but it could
probably use some tweaking.

Hash writes are random. The hash reads should be mostly sequential (for
large partitions it will be 128-block extents, or 1MB). The cost model
assumes 50% sequential and 50% random.

Sorts are written sequentially and read randomly, but there's
prefetching to keep the reads from being too random. The cost model
assumes 75% sequential and 25% random.

Overall, the IO pattern is better for Sort, but not dramatically so.
Tomas Vondra did some nice analysis here:

https://www.postgresql.org/message-id/20200525021045.dilgcsmgiu4l5jpa@development

That resulted in getting the prealloc and projection patches in.

Regards,
Jeff Davis

In response to

Re: Default setting for enable_hashagg_disk at 2020-06-25 19:56:07 from Bruce Momjian

Responses

Re: Default setting for enable_hashagg_disk at 2020-06-25 23:11:54 from Tomas Vondra

Browse pgsql-docs by date

	From	Date	Subject
Next Message	Bruce Momjian	2020-06-25 22:23:11	Re: create database with template doesn't copy database ACL
Previous Message	Andres Freund	2020-06-25 20:36:29	Re: Default setting for enable_hashagg_disk

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Melanie Plageman	2020-06-25 22:09:44	Re: Avoiding hash join batch explosions with extreme skew and weird stats
Previous Message	Alastair McKinley	2020-06-25 20:47:30	Re: CUBE_MAX_DIM