Re: Hash Join performance

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Vamsidhar Thummala <vamsi(at)cs(dot)duke(dot)edu>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Hash Join performance
Date: 2009-03-13 21:34:38
Message-ID: 18410.1236980078@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Vamsidhar Thummala <vamsi(at)cs(dot)duke(dot)edu> writes:
> 1) The actual time on Seq Scan on Lineitem shows that the first record is
> fetched at time 0.022ms and the last record is fetched at 53.5s. Does it
> mean the sequential scan is completed with-in first 53.4s (absolute time)?

No, it means that we spent a total of 53.5 seconds executing that plan
node and its children. There's no direct way to determine how that was
interleaved with the execution of a peer plan node. In the particular
case here, since that seqscan is the outer child of a hash join, you
can infer that all the time charged to the inner child (the Hash node
and its children) happened first, while we were building the hashtable,
which is then probed for each row of the outer relation.

> 2) Why is the Hash Join (top most) so slow?

Doesn't look that bad to me. The net time charged to the HashJoin node
is 186107.210 - 53597.555 - 112439.592 = 20070.063 msec. In addition it
would be reasonable to count the hashtable build time, which evidently
is 112439.592 - 111855.510 = 584.082 msec. So the hashtable build took
about 10 msec/row, in addition to the data fetching; and then the actual
join spent about 3 microsec per outer row, again exclusive of obtaining
those rows. The table build seems a bit slow, maybe, but I don't see a
problem with the join speed.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Vamsidhar Thummala 2009-03-13 22:11:52 Re: Hash Join performance
Previous Message Vamsidhar Thummala 2009-03-13 21:15:07 Hash Join performance