From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | James Coleman <jtc331(at)gmail(dot)com> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash |
Date: | 2019-11-10 22:07:01 |
Message-ID: | CA+hUKGKwChtb-PpT-WOhG9fn0nAcvb_Xw333ws4-QTWnhcti_g@mail.gmail.com |
Lists: | pgsql-bugs |
On Mon, Nov 11, 2019 at 10:38 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
> But now we're at the end of my understanding of how hash tables and
> joins are implemented in PG; is there a wiki page or design that might
> give me some current design description of how the buckets and batches
> work with the hash so I can keep following along?
We have something like the so-called "Grace" hash join (with the
"hybrid" refinement, irrelevant for this discussion):
https://en.wikipedia.org/wiki/Hash_join#Grace_hash_join
Our word "batch" just means partition. Most descriptions talk about
using two different hash functions for partition and bucket, but our
implementation uses a single hash function, and takes some of the bits
to choose the bucket and some of the bits to choose the batch. That
happens here:
https://github.com/postgres/postgres/blob/master/src/backend/executor/nodeHash.c#L1872
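
To make the bit-splitting idea concrete, here is a minimal sketch in C. It is not the PostgreSQL source: `split_hash` and its parameters are hypothetical names chosen for illustration, and it assumes a 32-bit hash value and that both the number of buckets and the number of batches are powers of two. The low bits pick the bucket and the next bits up pick the batch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the PostgreSQL function): split one 32-bit hash
 * value into a bucket number and a batch number.
 *
 * Assumption: nbuckets and nbatch are both powers of two, so masking with
 * (n - 1) keeps exactly log2(n) bits.
 */
static void
split_hash(uint32_t hashvalue, uint32_t nbuckets, uint32_t nbatch,
           uint32_t *bucketno, uint32_t *batchno)
{
    uint32_t log2_nbuckets = 0;

    /* Check the power-of-two assumption. */
    assert(nbuckets > 0 && (nbuckets & (nbuckets - 1)) == 0);
    assert(nbatch > 0 && (nbatch & (nbatch - 1)) == 0);

    /* Count how many low bits the bucket number consumes. */
    while ((UINT32_C(1) << log2_nbuckets) < nbuckets)
        log2_nbuckets++;

    /* Low bits choose the bucket within the in-memory hash table. */
    *bucketno = hashvalue & (nbuckets - 1);

    /* The bits just above those choose the batch (partition). */
    *batchno = (hashvalue >> log2_nbuckets) & (nbatch - 1);
}
```

For example, with 1024 buckets and 16 batches, bits 0-9 of the hash select the bucket and bits 10-13 select the batch. In a scheme like this the two bit ranges share a single 32-bit value, so log2(nbuckets) + log2(nbatch) cannot usefully exceed 32.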