From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Avoiding hash join batch explosions with extreme skew and weird stats
Date: 2019-06-04 19:15:47
Message-ID: CA+Tgmoa1DuROuD5ngwDLHqs-TrwNrkuVxE5nC+5dFqcm_HQ18g@mail.gmail.com
Lists: pgsql-hackers
On Tue, Jun 4, 2019 at 3:09 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> One question I have is, how would the OR'd together bitmap be
> propagated to workers after the first chunk? That is, when there are
> no tuples left in the outer bunch, for a given inner chunk, would you
> load the bitmaps from each worker into memory, OR them together, and
> then write the updated bitmap back out so that each worker starts with
> the updated bitmap?
I was assuming we'd elect one participant to go read all the bitmaps,
OR them together, and generate all the required null-extended tuples,
sort of like how the PHJ_BUILD_ALLOCATING, PHJ_GROW_BATCHES_ALLOCATING,
PHJ_GROW_BUCKETS_ALLOCATING, and PHJ_BATCH_ALLOCATING states involve
only one participant being active at a time. Now you could hope for
something better -- why not parallelize that work? But on the other
hand, why not start simple and worry about that in some future patch
instead of right away? A committed patch that does something good is
better than an uncommitted patch that does something AWESOME.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company