Re: BUG #14344: string_agg(DISTINCT ..) crash

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Regina Obe <lr(at)pcorp(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14344: string_agg(DISTINCT ..) crash
Date: 2016-09-29 15:30:53
Message-ID: CAM3SWZQxeF5r4=U9-ikgn86Zun+LbcRfv6gQ0SWdWPQ5XVC6BQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Sep 29, 2016 at 1:10 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> I have been able to reproduce the problem, in a fashion, so can now
> probably debug it easily enough. Specifically, I can get Valgrind to
> complain about Regina's test case.

I see what the problem is. This use-after-free bug occurs because, for
some reason, tuplesort_gettupleslot()'s contract was not updated by me
alongside analogous routines like tuplesort_getindextuple() as part of
the batch memory commit. This meant that at least one particular case
(process_ordered_aggregate_multi() calls to tuplesort_gettupleslot(),
with distinct columns) felt entitled to reuse some slot's tuple, a
tuple located in memory managed by tuplesort.c as batch memory. The
reuse was across calls to tuplesort_gettupleslot().

Attached patch fixes the bug, and updates the contract of
tuplesort_gettupleslot() so it's no longer the odd one out. However,
I'm not recommending this be committed without further discussion,
because the performance overhead could be a concern. Also, it looks
like hypothetical_dense_rank_final() requires similar treatment, which
isn't addressed by this patch. I did audit other callers of
tuplesort_gettupleslot() and analogous routines for other classes of
caller-tuple (e.g., tuplesort_getindextuple() callers were examined
too). Look like it's just these two tuplesort callers that are
affected.

How much could it hurt performance to fix the bug in this way? Does
anyone see an alternative? Pushing knowledge of this special case into
tuplesort.c seems like an unappealing alternative.

--
Peter Geoghegan

Attachment Content-Type Size
0001-Fix-use-after-free-around-DISTINCT-transition-functi.patch text/x-patch 2.5 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2016-09-29 15:40:46 Re: BUG #14344: string_agg(DISTINCT ..) crash
Previous Message Kevin Grittner 2016-09-29 14:48:19 Re: BUG #14345: run multi-PG9.6 on One Host, hang when checkpoint