From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Dilip kumar <dilip(dot)kumar(at)huawei(dot)com> |
Cc: | Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br> |
Subject: | Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ] |
Date: | 2014-11-23 06:45:00 |
Message-ID: | CAA4eK1JVTFZr-Uvz=hsq_kFCye_vqqRJLdq1-4v2F120FNaEmw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Nov 17, 2014 at 8:55 AM, Dilip kumar <dilip(dot)kumar(at)huawei(dot)com> wrote:
>
> On 13 November 2014 15:35 Amit Kapila Wrote,
> >As mentioned by you offlist that you are not able reproduce this
>
> >issue, I have tried again and what I observe is that I am able to
>
> >reproduce it only on *release* build and some cases work without
>
> >this issue as well,
>
> >example:
>
> >./vacuumdb --analyze-in-stages -t t1 -t t2 -t t3 -t t4 -t t5 -t t6 -t t7
-t t8 -j 8 -d postgres
>
> >Generating minimal optimizer statistics (1 target)
>
> >Generating medium optimizer statistics (10 targets)
>
> >Generating default (full) optimizer statistics
>
>
> >So to me, it looks like this is a timing issue and please notice
>
> >why in error the statement looks like
>
> >"ANALYZE ng minimal optimizer statistics (1 target)". I think this
>
> >is not a valid statement.
>
> >Let me know if you still could not reproduce it.
>
>
> I will review the code and try to find the same, meanwhile if you can
find some time to debug this, it will be really helpful.
>
I think I have found the problem and fix for the same.
The stacktrace of crash is as below:
#0 0x00000080108cf3a4 in .strlen () from /lib64/libc.so.6
#1 0x00000080108925bc in ._IO_vfprintf () from /lib64/libc.so.6
#2 0x00000080108bc1e0 in .__GL__IO_vsnprintf_vsnprintf () from
/lib64/libc.so.6
#3 0x00000fff7e581590 in .appendPQExpBufferVA () from
/data/akapila/workspace/master/installation/lib/libpq.so.5
#4 0x00000fff7e581774 in .appendPQExpBuffer () from
/data/akapila/workspace/master/installation/lib/libpq.so.5
#5 0x0000000010003748 in .run_parallel_vacuum ()
#6 0x0000000010003f60 in .vacuum_parallel ()
#7 0x0000000010002ae4 in .main ()
(gdb) f 5
#5 0x0000000010003748 in .run_parallel_vacuum ()
So now the real reason here is that the list of tables passed to
function is corrupted. The below code seems to be the real
culprit:
vacuum_parallel()
{
..
if (!tables || !tables->head)
{
SimpleStringList dbtables = {NULL, NULL};
...
..
tables = &dbtables;
}
..
}
In above code dbtables is local to if loop and code
is using the address of same to assign it to tables which
is used out of if block scope, moving declaration to the
outer scope fixes the problem in my environment. Find the
updated patch that fixes this problem attached with this
mail. Let me know your opinion about the same.
While looking at this problem, I have noticed couple of other
improvements:
a. In prepare_command() function, patch is doing init of sql
buffer (initPQExpBuffer(sql);) which I think is not required
as both places from where this function is called, it is done by
caller. I think this will lead to memory leak.
b. In prepare_command() function, for fixed strings you can
use appendPQExpBufferStr() which is what used in original code
as well.
c.
run_parallel_vacuum()
{
..
prepare_command(connSlot[free_slot].connection, full, verbose,
and_analyze, analyze_only, freeze, &sql);
appendPQExpBuffer(&sql, " %s", cell->val);
..
}
I think it is better to end command with ';' by using
appendPQExpBufferStr(&sql, ";"); in above code.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
vacuumdb_parallel_v17.patch | application/octet-stream | 23.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2014-11-23 11:23:45 | Re: Patch to support SEMI and ANTI join removal |
Previous Message | Tom Lane | 2014-11-23 00:17:29 | Re: [PATCH] Simplify EXISTS subqueries containing LIMIT |