From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
---|---|
To: | Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> |
Cc: | Peter Geoghegan <pg(at)heroku(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: WIP: [[Parallel] Shared] Hash |
Date: | 2017-02-15 23:22:47 |
Message-ID: | CAEepm=0fStoz2Bgu=zD6efXDYt17cNo-a5a6tWwooh=xrBvZRw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Out of archeological curiosity, I was digging around in the hash join
code and RCS history from Postgres 4.2[1], and I was astounded to
discover that it had a parallel executor for Sequent SMP systems and
was capable of parallel hash joins as of 1991. At first glance, it
seems to follow approximately the same design as I propose: share a
hash table and use a barrier to coordinate the switch from build phase
to probe phase and deal with later patches. It uses mmap to get space
and then works with relative pointers. See
src/backend/executor/n_hash.c and src/backend/executor/n_hashjoin.c.
Some of this might be described in Wei Hong's PhD thesis[2] which I
haven't had the pleasure of reading yet.
The parallel support is absent from the first commit in our repo
(1996), but there are some vestiges like RelativeAddr and ABSADDR used
to access the hash table (presumably needlessly) and also some
mentions of parallel machines in comments that survived up until
commit 26069a58 (1999).
[1] http://db.cs.berkeley.edu/postgres.html
[2] http://db.cs.berkeley.edu/papers/ERL-M93-28.pdf
--
Thomas Munro
http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2017-02-15 23:30:30 | Re: bytea_output vs make installcheck |
Previous Message | Jeff Janes | 2017-02-15 23:05:28 | Re: gitlab post-mortem: pg_basebackup waiting for checkpoint |