Re: UPDATE using 3 medium-sized tables causes runaway hash table and fills disk

From: Thom Brown <thom(at)linux(dot)com>
To: Rowan Collins <rowan(dot)collins(at)gmail(dot)com>
Cc: PGSQL Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: UPDATE using 3 medium-sized tables causes runaway hash table and fills disk
Date: 2013-04-29 14:23:48
Message-ID: CAA-aLv5AELEx08L4MEJPDhUkpHs2Ymu=aCgGbcqg3ubSOCRHgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 26 April 2013 15:39, Rowan Collins <rowan(dot)collins(at)gmail(dot)com> wrote:
> Hi All,
>
> I've come upon some very strange behaviour with an UPDATE query which causes
> Postgres to consume all the disk space on the server for no apparent reason.
>
> Basically, I'm trying to run an UPDATE involving three medium-sized tables
> (~60k rows each), which generates a query plan with three Hash Joins. But
> when I try to execute it, Postgres appears to go into some kind of loop,
> gradually filling up the disk partition. After a long wait it responds with
> "ERROR: could not write to hash-join temporary file: No space left on
> device; SQL state: 53100"; the partition in question has 9.5GB free at the
> beginning of the query - that's a lot of hash file!
>
> If I run "ANALYZE temp_fares_mappings;" - the table which is being Updated,
> and is the outermost in the query plan - the problem goes away *even though
> the Query Plan hasn't changed*.
>
> I'm not entirely sure how to simplify the query and still reproduce the
> issue, but I've produced an anonymized data sample and SQL query at
> http://rwec.co.uk/x/break_postgres.zip In this case, "Analyze" *does* change
> the query plan (as well as fixing the issue), but I can consistently
> reproduce the disk-filling behaviour using this sample on Postgres 9.0 and
> 9.2.
>
> Note that it doesn't appear to matter if one or all of the tables are
> permanent, as I have been using various combinations for debugging, and
> always get the same behaviour. Trying to write a corresponding SELECT
> statement doesn't seem to generate the same issue, at least with the queries
> I've tried.
>
> - The plan for the real query is here: http://explain.depesz.com/s/WTP
> - Sample data and SQL query: http://rwec.co.uk/x/break_postgres.zip
> - Query plan for sample data, without running Analyze, is here:
> http://explain.depesz.com/s/qsH
> - Postgres version strings: "PostgreSQL 9.0.5 on x86_64-pc-linux-gnu,
> compiled by GCC x86_64-pc-linux-gnu-gcc (Gentoo 4.5.2 p1.1, pie-0.4.5)
> 4.5.2, 64-bit" and "PostgreSQL 9.2.1 on x86_64-pc-linux-gnu, compiled by
> x86_64-pc-linux-gnu-gcc (Gentoo Hardened 4.5.4 p1.0, pie-0.4.7) 4.5.4,
> 64-bit"
>
> Does anyone have any idea what's going on here, and whether it is in fact a
> bug? It certainly feels like one...
> Thanks for your time.

You done gone broke Postgres. The same occurs when converting your
UPDATE query into a SELECT when I try it:

SELECT
*
FROM
test_mappings AS LFM, test_low_fares AS LF
JOIN
test_cob_offers AS CO
ON
CO.num_adults_searched = LF.number_in_party
AND
CO.num_children_searched = 0
AND
CO.num_infants_searched = 0
WHERE
LFM.low_fares_row_id = LF.low_fares_row_id
AND
CO.central_offer_id = LFM.central_offer_id
AND
CO.credential_set_id = LFM.credential_set_id
AND
LFM.cob_offer_id IS NULL;

Well something appears to be tangling up the executor.
--
Thom

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Shaun Thomas 2013-04-29 16:04:51 Re: UPDATE using 3 medium-sized tables causes runaway hash table and fills disk
Previous Message Willy-Bas Loos 2013-04-29 12:01:25 Re: don't i need a -modules package for londiste3?