From: | Ali Akbar <the(dot)apaan(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: decreasing memory needlessly consumed by array_agg |
Date: | 2014-12-20 08:26:51 |
Message-ID: | CACQjQLrsbbN=MQxOAAnS1k10gkK3H-EH3ERHELkWmqp-dh2dNA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
2014-12-16 11:01 GMT+07:00 Ali Akbar <the(dot)apaan(at)gmail(dot)com>:
>
>
>
> 2014-12-16 10:47 GMT+07:00 Ali Akbar <the(dot)apaan(at)gmail(dot)com>:
>>
>>
>> 2014-12-16 6:27 GMT+07:00 Tomas Vondra <tv(at)fuzzy(dot)cz>:
>> Just fast-viewing the patch.
>>
>> The patch is not implementing the checking for not creating new context
>> in initArrayResultArr. I think we should implement it also there for
>> consistency (and preventing future problems).
>>
>
Testing the performance with your query, looks promising: speedup is
between 12% ~ 15%.
Because i'm using 32-bit systems, setting work_mem to 1024GB failed:
> ERROR: 1073741824 is outside the valid range for parameter "work_mem" (64
> .. 2097151)
> STATEMENT: SET work_mem = '1024GB';
> psql:/media/truecrypt1/oss/postgresql/postgresql/../patches/array-agg.sql:20:
> ERROR: 1073741824 is outside the valid range for parameter "work_mem" (64
> .. 2097151)
>
Maybe because of that, in the large groups a test, the speedup is awesome:
> master: 16,819 ms
>
with patch: 1,720 ms
>
Looks like with master, postgres resort to disk, but with the patch it fits
in memory.
Note: I hasn't tested the large dataset.
As expected, testing array_agg(anyarray), the performance is still the
same, because the subcontext hasn't implemented there (test script modified
from Tomas', attached).
I implemented the subcontext checking in initArrayResultArr by changing the
v3 patch like this:
> +++ b/src/backend/utils/adt/arrayfuncs.c
> @@ -4797,10 +4797,11 @@ initArrayResultArr(Oid array_type, Oid
> element_type, MemoryContext rcontext,
> bool subcontext)
> {
> ArrayBuildStateArr *astate;
> - MemoryContext arr_context;
> + MemoryContext arr_context = rcontext; /* by default use the parent
> ctx */
>
> /* Make a temporary context to hold all the junk */
> - arr_context = AllocSetContextCreate(rcontext,
> + if (subcontext)
> + arr_context = AllocSetContextCreate(rcontext,
> "accumArrayResultArr",
> ALLOCSET_DEFAULT_MINSIZE,
> ALLOCSET_DEFAULT_INITSIZE,
>
Testing the performance, it got the 12%~15% speedup. Good. (patch attached)
Looking at the modification in accumArrayResult* functions, i don't really
> comfortable with:
>
> 1. Code that calls accumArrayResult* after explicitly calling
> initArrayResult* must always passing subcontext, but it has no effect.
> 2. All existing codes that calls accumArrayResult must be changed.
>
> Just an idea: why don't we minimize the change in API like this:
>
> 1. Adding parameter bool subcontext, only in initArrayResult*
> functions but not in accumArrayResult*
> 2. Code that want to not creating subcontext must calls
> initArrayResult* explicitly.
>
> Other codes that calls directly to accumArrayResult can only be changed in
> the call to makeArrayResult* (with release=true parameter). In places that
> we don't want to create subcontext (as in array_agg_transfn), modify it to
> use initArrayResult* before calling accumArrayResult*.
>
> What do you think?
>
As per your concern about calling initArrayResult* with subcontext=false,
while makeArrayResult* with release=true:
> Also, it seems that using 'subcontext=false' and then 'release=true'
> would be a bug. Maybe it would be appropriate to store the
> 'subcontext' value into the ArrayBuildState and then throw an error
> if makeArrayResult* is called with (release=true && subcontext=false).
>
Yes, i think we should do that to minimize unexpected coding errors. In
makeArrayResult*, i think its better to not throwing an error, but using
assertions:
> Assert(release == false || astate->subcontext == true);
>
Regards,
--
Ali Akbar
Attachment | Content-Type | Size |
---|---|---|
array-agg-anyarray.sql | application/sql | 3.3 KB |
array-agg-v3-modified.patch | text/x-diff | 15.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Martijn van Oosterhout | 2014-12-20 10:16:47 | Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} |
Previous Message | Christian Ullrich | 2014-12-20 08:06:13 | Re: New Python vs. old PG on raccoon and jaguarundi |