Re: hash joins are causing no space left error

From: Ayub M <hiayub(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: hash joins are causing no space left error
Date: 2020-08-15 12:47:56
Message-ID: CAOS0qEtwtBuKHGViBiqEHSbSzMNZ3d8XgMtqXu2CcHK4QWtPVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Aug 12, 2020 at 8:06 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Ayub M <hiayub(at)gmail(dot)com> writes:
> > This is PostgreSQL 11 on AWS, there is a mview query in this OLAP
> database,
> > the tables involved are huge - 50-100m records on average records
> hundreds
> > of columns in most cases. The query runs for a while and then errors out
> > saying "No space left on device". I could see it generating around 500gb
> of
> > temp file data. At times it goes thru and at times it fails - probably
> due
> > to other queries running at the same time and causing failure.
>
> Are you sure that these queries are actually producing the answers you
> want? It sounds suspiciously like you are computing underconstrained
> joins.
> --> Yes, it is as per the business requirement.
>

> > The joins are happening on around 10 tables and all are joining on the PK
> > and FK columns. I see partition pruning happening but the hash joins are
> > killing the query.
> > Is there any way to avoid hash joins?
>
> TBH, you are asking the wrong question. A merge join would take about as
> much temporary space, and a nestloop join over so much data would probably
> not finish in an amount of time you're willing to wait. Indexes are NOT
> a magic solution here. What you need to be thinking about is how to not
> need to process so much data.
>
> If you really need to have this proven to you, you can try "set
> enable_hashjoin = off", but I don't think you'll find that better.
> --> You are right, neither merge join nor nested loop are resolving the
> issue.
>

> regards, tom lane
>

--
Regards,
Ayub

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Mark Phillips 2020-08-15 19:14:29 Re: serial + db key, or guid?
Previous Message Ayub M 2020-08-15 12:47:46 Re: hash joins are causing no space left error