Re: Disk-based hash aggregate's cost model

From:	Jeff Davis <pgsql(at)j-davis(dot)com>
To:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc:	Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Disk-based hash aggregate's cost model
Date:	2020-09-04 19:33:24
Message-ID:	d518e363381470aa4b8595281a8384017ddf80da.camel@j-davis.com
Lists:	pgsql-hackers

On Fri, 2020-09-04 at 21:01 +0200, Tomas Vondra wrote:
> Wouldn't it be enough to just use a slot with a smaller tuple
> descriptor?
> All we'd need to do is create the descriptor in ExecInitAgg after
> calling find_hash_columns, use it for rslot/wslot, and then "map"
> the attributes in hashagg_spill_tuple (which already almost does
> that, so the extra cost should be 0) and when reading the spilled
> tuples.

That's a good point; it's probably not much code to make it work.

> So I'm not quite buying the argument that this would make a
> measurable difference ...

I meant that "projection of all input tuples" (i.e. CP_SMALL_TLIST) has
a cost. If we project only at spill time, it should be fine.
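
The distinction can be illustrated with a toy Python model. This is only a
sketch, not PostgreSQL code: the names project, hash_agg_count, max_groups
and project_early are hypothetical, and max_groups merely stands in for a
memory limit. It counts how many narrow-tuple projections each strategy
performs for a COUNT(*) ... GROUP BY over wide input rows.

```python
from typing import Dict, List, Sequence, Tuple

Row = Tuple[object, ...]


def project(row: Row, needed_cols: Sequence[int]) -> Row:
    """Copy only the needed columns into a narrower tuple."""
    return tuple(row[i] for i in needed_cols)


def hash_agg_count(
    rows: Sequence[Row],
    group_col: int,
    needed_cols: Sequence[int],
    max_groups: int,
    project_early: bool,
) -> Tuple[Dict[object, int], List[Row], int]:
    """Toy COUNT(*) GROUP BY with a bounded in-memory hash table.

    Returns (in-memory counts, spilled narrow tuples, projections done).
    max_groups is a hypothetical stand-in for a memory limit.
    """
    table: Dict[object, int] = {}
    spilled: List[Row] = []
    projections = 0
    # Position of the grouping key inside the narrow tuple.
    key_pos = list(needed_cols).index(group_col)

    for row in rows:
        narrow = None
        if project_early:
            # Project every input tuple up front, whether or not it spills.
            narrow = project(row, needed_cols)
            projections += 1
            key = narrow[key_pos]
        else:
            # Read the key straight from the wide input row.
            key = row[group_col]

        if key in table:
            table[key] += 1
        elif len(table) < max_groups:
            table[key] = 1
        else:
            # Table is full: spill the tuple, projecting now if needed.
            if narrow is None:
                narrow = project(row, needed_cols)
                projections += 1
            spilled.append(narrow)

    return table, spilled, projections
```

In this model both strategies produce the same in-memory counts and the same
spilled tuples; only the number of projections differs. Projecting at spill
time pays once per spilled tuple instead of once per input tuple.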

Regards,
Jeff Davis

In response to

Re: Disk-based hash aggregate's cost model at 2020-09-04 19:01:37 from Tomas Vondra
