From: | Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
Cc: | Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: WIP: [[Parallel] Shared] Hash |
Date: | 2017-02-02 03:57:30 |
Message-ID: | CAOGQiiNk5Uri44t+jS5Z3rMTEKshhcTdDEB33JRM=kYSXNwpYw@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Feb 2, 2017 at 1:19 AM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Thu, Feb 2, 2017 at 3:34 AM, Rafia Sabih
> <rafia(dot)sabih(at)enterprisedb(dot)com> wrote:
>> 9 | 62928.88 | 59077.909
>
> Thanks Rafia. At first glance this plan is using the Parallel Shared
> Hash in one place where it should pay off, that is loading the orders
> table, but the numbers are terrible. I noticed that it uses batch
> files and then has to increase the number of batch files, generating a
> bunch of extra work, even though it apparently overestimated the
> number of rows, though that's only ~9 seconds of ~60. I am
> investigating.
Hi Thomas,
Apart from the previously reported regression, there appears to be one more
issue in this set of patches. At times, a query using parallel hash hangs,
and all the workers, including the master, show the following backtrace:
#0 0x00003fff880c7de8 in __epoll_wait_nocancel () from /lib64/power8/libc.so.6
#1 0x00000000104e2718 in WaitEventSetWaitBlock (set=0x100157bde90, cur_timeout=-1, occurred_events=0x3fffdbe69698, nevents=1) at latch.c:998
#2 0x00000000104e255c in WaitEventSetWait (set=0x100157bde90, timeout=-1, occurred_events=0x3fffdbe69698, nevents=1, wait_event_info=134217745) at latch.c:950
#3 0x0000000010512970 in ConditionVariableSleep (cv=0x3ffd736e05a4, wait_event_info=134217745) at condition_variable.c:132
#4 0x00000000104dbb1c in BarrierWaitSet (barrier=0x3ffd736e0594, new_phase=1, wait_event_info=134217745) at barrier.c:97
#5 0x00000000104dbb9c in BarrierWait (barrier=0x3ffd736e0594, wait_event_info=134217745) at barrier.c:127
#6 0x00000000103296a8 in ExecHashShrink (hashtable=0x3ffd73747dc0) at nodeHash.c:1075
#7 0x000000001032c46c in dense_alloc_shared (hashtable=0x3ffd73747dc0, size=40, shared=0x3fffdbe69eb8, respect_work_mem=1 '\001') at nodeHash.c:2618
#8 0x000000001032a2f0 in ExecHashTableInsert (hashtable=0x3ffd73747dc0, slot=0x100158f9e90, hashvalue=2389907270) at nodeHash.c:1476
#9 0x0000000010327fd0 in MultiExecHash (node=0x100158f9800) at nodeHash.c:296
#10 0x0000000010306730 in MultiExecProcNode (node=0x100158f9800) at execProcnode.c:577
The issue is not deterministic and not straightforwardly reproducible:
sometimes, after make clean etc., the queries run, and sometimes they hang
again. I wanted to bring this to your notice, hoping you might be
faster than me in picking up the exact reason behind this anomaly.
--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/