From: Adam Lee <ali(at)pivotal(dot)io>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Taylor Vesely <tvesely(at)pivotal(dot)io>, Melanie Plageman <mplageman(at)pivotal(dot)io>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory-Bounded Hash Aggregation
Date: 2020-02-20 05:47:19
Message-ID: 20200220054719.GB4389@earth.local
Lists: pgsql-hackers
On Wed, Feb 19, 2020 at 08:16:36PM +0100, Tomas Vondra wrote:
> 5) Assert(nbuckets > 0);
> ...
> This however quickly fails on this assert in BuildTupleHashTableExt (see
> backtrace1.txt):
>
> Assert(nbuckets > 0);
>
> The value is computed in hash_choose_num_buckets, and there seem to be
> no protections against returning bogus values like 0. So maybe we should
> return
>
> Max(nbuckets, 1024)
>
> or something like that, similarly to hash join. OTOH maybe it's simply
> due to agg_refill_hash_table() passing bogus values to the function?
>
>
> 6) Another thing that occurred to me was what happens to grouping sets,
> which we can't spill to disk. So I did this:
>
> create table t2 (a int, b int, c int);
>
> -- run repeatedly, until there are about 20M rows in t2 (1GB)
> with tx as (select array_agg(a) as a, array_agg(b) as b
> from (select a, b from t order by random()) foo),
> ty as (select array_agg(a) AS a
> from (select a from t order by random()) foo)
> insert into t2 select unnest(tx.a), unnest(ty.a), unnest(tx.b)
> from tx, ty;
>
> analyze t2;
> ...
>
> which fails with segfault at execution time:
>
> tuplehash_start_iterate (tb=0x18, iter=iter(at)entry=0x2349340)
> 870 for (i = 0; i < tb->size; i++)
> (gdb) bt
> #0 tuplehash_start_iterate (tb=0x18, iter=iter(at)entry=0x2349340)
> #1 0x0000000000654e49 in agg_retrieve_hash_table_in_memory ...
>
> That's not surprising, because the 0x18 pointer is obviously bogus. I guess
> this is simply an offset of 0x18 (24 bytes) added to a NULL pointer?
I did some investigation. Did you have assertions disabled when this panic
happened? If so, it's the same issue as "5) nbuckets == 0": a zero size gets
passed to the allocator when it creates the hash table that ends up at 0x18.
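
For illustration only (standalone C, not the patch's code; the struct and
member names here are made up): if the containing pointer degenerates to
NULL, the address of a member at offset 0x18 is exactly the bogus 0x18 in
the backtrace:

    #include <stdio.h>
    #include <stddef.h>

    /* stand-in for whatever struct embeds the hash table */
    struct container
    {
        char pad[0x18];     /* earlier members, 24 bytes in total */
        int  hashtable;     /* stand-in for the embedded hash table */
    };

    int
    main(void)
    {
        struct container *c = NULL;

        /* the member offset is 0x18 ... */
        printf("%#zx\n", offsetof(struct container, hashtable));

        /* ... so &c->hashtable with c == NULL comes out as 0x18
         * (technically undefined behavior, shown only to illustrate) */
        printf("%p\n", (void *) &c->hashtable);
        return 0;
    }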
Sorry, my testing environment is acting up right now, so I haven't
reproduced it yet.
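
If that theory holds, clamping the bucket count the way hash join does
should cover both this and the assert failure. A rough sketch (untested,
and the helper name is mine, not from the patch):

    #include "postgres.h"

    /*
     * Hypothetical helper: floor the estimate at 1024, as hash join's
     * ExecChooseHashTableSize() does, so we never hand the allocator a
     * zero size and Assert(nbuckets > 0) can't fire.
     */
    static long
    clamp_nbuckets(long nbuckets)
    {
        return Max(nbuckets, 1024);
    }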
--
Adam Lee