Re: WIP: [[Parallel] Shared] Hash

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: WIP: [[Parallel] Shared] Hash
Date: 2017-03-26 23:12:37
Message-ID: CAH2-Wz=CH59jxL-W67gaKzCM-ao8j3QWhjdf-iWPt9QCwWJ4tQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Mar 26, 2017 at 3:41 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> 1. Segments are what buffile.c already calls the individual
>> capped-at-1GB files that it manages. They are an implementation
>> detail that is not part of buffile.c's user interface. There seems to
>> be no reason to change that.
>
> After reading your next email I realised this is not quite true:
> BufFileTell and BufFileSeek expose the existence of segments.

Yeah, that's something that tuplestore.c itself relies on.
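To illustrate how segments leak through that interface, here is a minimal sketch of the mapping between a flat logical position and the (fileno, offset) pair that BufFileTell reports and BufFileSeek accepts. This is not code from buffile.c: the names SKETCH_SEGMENT_BYTES, SketchPos, sketch_split and sketch_join are hypothetical, and the 1GB segment size is assumed from the discussion above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed segment cap of 1GB; the macro name is hypothetical. */
#define SKETCH_SEGMENT_BYTES ((int64_t) 1 << 30)

/* Hypothetical stand-in for the (fileno, offset) pair a caller sees. */
typedef struct SketchPos
{
	int			fileno;		/* which segment file */
	int64_t		offset;		/* byte offset within that segment */
} SketchPos;

/* Split a flat logical byte position into segment number and offset. */
static SketchPos
sketch_split(int64_t logical)
{
	SketchPos	pos;

	assert(logical >= 0);
	pos.fileno = (int) (logical / SKETCH_SEGMENT_BYTES);
	pos.offset = logical % SKETCH_SEGMENT_BYTES;
	return pos;
}

/* Inverse of sketch_split: rebuild the flat logical byte position. */
static int64_t
sketch_join(SketchPos pos)
{
	assert(pos.fileno >= 0);
	assert(pos.offset >= 0 && pos.offset < SKETCH_SEGMENT_BYTES);
	return (int64_t) pos.fileno * SKETCH_SEGMENT_BYTES + pos.offset;
}
```

For instance, a logical position of 1GB plus 5 bytes splits into fileno 1, offset 5. Any caller that stores such a pair depends on the segment size staying fixed.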

I always thought that the main practical reason why we have BufFile
multiplex 1GB segments concerns use of temp_tablespaces, rather than
considerations that matter only when using obsolete file systems:

/*
 * We break BufFiles into gigabyte-sized segments, regardless of RELSEG_SIZE.
 * The reason is that we'd like large temporary BufFiles to be spread across
 * multiple tablespaces when available.
 */

Now, I tend to think that most installations that care about
performance would be better off using RAID to stripe their one temp
tablespace file system. But, I suppose this still makes sense when you
have a number of file systems that happen to be available, and disk
capacity is the main concern. PHJ uses one temp tablespace per worker,
which I further suppose might not be as effective in balancing disk
space usage.
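As a rough illustration of the disk-space balancing point, here is a sketch that computes how many bytes of one temp file land in each tablespace if successive 1GB segments are placed round-robin. The round-robin rule, the 1GB size and the names SKETCH_SEGMENT_BYTES and sketch_spread are assumptions made for the example, not a description of what fd.c or buffile.c actually do.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed segment cap of 1GB; the macro name is hypothetical. */
#define SKETCH_SEGMENT_BYTES ((int64_t) 1 << 30)

/*
 * Fill usage[0 .. ntablespaces-1] with the bytes of a single temp file that
 * end up in each tablespace, assuming segment k is placed in tablespace
 * (first + k) % ntablespaces.
 */
static void
sketch_spread(int64_t file_bytes, int ntablespaces, int first, int64_t *usage)
{
	int64_t		remaining = file_bytes;

	assert(file_bytes >= 0 && ntablespaces > 0 && first >= 0);

	for (int i = 0; i < ntablespaces; i++)
		usage[i] = 0;

	for (int seg = 0; remaining > 0; seg++)
	{
		/* The last segment may be only partially filled. */
		int64_t		chunk = remaining < SKETCH_SEGMENT_BYTES ?
			remaining : SKETCH_SEGMENT_BYTES;

		usage[(first + seg) % ntablespaces] += chunk;
		remaining -= chunk;
	}
}
```

Under these assumptions a 2.5GB file over three tablespaces comes out as 1GB, 1GB and 0.5GB, whereas with ntablespaces set to 1 (roughly the situation of a worker pinned to one temp tablespace) the whole 2.5GB sits in a single place.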

--
Peter Geoghegan
