From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Memory leak in incremental sort re-scan |
Date: | 2023-06-15 13:19:47 |
Message-ID: | 99e584a1-2f55-a56d-efb1-d38c57eb2beb@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 6/15/23 13:48, Laurenz Albe wrote:
> ExecIncrementalSort() calls tuplesort_begin_common(), which creates the "TupleSort main"
> and "TupleSort sort" memory contexts, and ExecEndIncrementalSort() calls tuplesort_end(),
> which destroys them.
> But ExecReScanIncrementalSort() only resets the memory contexts. Since the next call to
> ExecIncrementalSort() will create them again, we end up leaking these contexts for every
> re-scan.
>
> Here is a reproducer with the regression test database:
>
> SET enable_sort = off;
> SET enable_hashjoin = off;
> SET enable_mergejoin = off;
> SET enable_material = off;
>
> SELECT t.unique2, t2.r
> FROM tenk1 AS t
> JOIN (SELECT unique1,
> row_number() OVER (ORDER BY hundred, thousand) AS r
> FROM tenk1
> OFFSET 0) AS t2
> ON t.unique1 + 0 = t2.unique1
> WHERE t.unique1 < 1000;
>
> The execution plan:
>
> Nested Loop
> Join Filter: ((t.unique1 + 0) = tenk1.unique1)
> -> Bitmap Heap Scan on tenk1 t
> Recheck Cond: (unique1 < 1000)
> -> Bitmap Index Scan on tenk1_unique1
> Index Cond: (unique1 < 1000)
> -> WindowAgg
> -> Incremental Sort
> Sort Key: tenk1.hundred, tenk1.thousand
> Presorted Key: tenk1.hundred
> -> Index Scan using tenk1_hundred on tenk1
>
>
> A memory context dump at the end of the execution looks like this:
>
> ExecutorState: 262144 total in 6 blocks; 74136 free (29 chunks); 188008 used
> TupleSort main: 32832 total in 2 blocks; 7320 free (0 chunks); 25512 used
> TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
> Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 chunks); 208 used
> TupleSort main: 32832 total in 2 blocks; 7256 free (0 chunks); 25576 used
> TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
> Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 chunks); 208 used
> TupleSort main: 32832 total in 2 blocks; 7320 free (0 chunks); 25512 used
> TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
> Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 chunks); 208 used
> [many more]
> 1903 more child contexts containing 93452928 total in 7597 blocks; 44073240 free (0 chunks); 49379688 used
>
>
> The following patch fixes the problem for me:
>
> --- a/src/backend/executor/nodeIncrementalSort.c
> +++ b/src/backend/executor/nodeIncrementalSort.c
> @@ -1145,21 +1145,16 @@ ExecReScanIncrementalSort(IncrementalSortState *node)
> node->execution_status = INCSORT_LOADFULLSORT;
>
> /*
> - * If we've set up either of the sort states yet, we need to reset them.
> - * We could end them and null out the pointers, but there's no reason to
> - * repay the setup cost, and because ExecIncrementalSort guards presorted
> - * column functions by checking to see if the full sort state has been
> - * initialized yet, setting the sort states to null here might actually
> - * cause a leak.
> + * Release tuplesort resources.
> */
> if (node->fullsort_state != NULL)
> {
> - tuplesort_reset(node->fullsort_state);
> + tuplesort_end(node->fullsort_state);
> node->fullsort_state = NULL;
> }
> if (node->prefixsort_state != NULL)
> {
> - tuplesort_reset(node->prefixsort_state);
> + tuplesort_end(node->prefixsort_state);
> node->prefixsort_state = NULL;
> }
>
>
> The original comment hints that this might mot be the correct thing to do...
>
I think it's correct, but I need to look at the code more closely - it's
been a while. The code is a bit silly, as it resets the tuplesort and
then throws away all the pointers - so what could the _end() break?
AFAICS the comment says that we can't just do tuplesort_reset and keep
the pointers, because some other code depends on them being NULL.
In hindsight, that's a bit awkward - it'd probably be better to have a
separate flag, which would allow us to just reset the tuplesort.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2023-06-15 13:21:38 | Re: When IMMUTABLE is not. |
Previous Message | torikoshia | 2023-06-15 13:00:46 | Re: RFC: Logging plan of the running query |