From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>, panam <panam(at)gmx(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PERFORM] Hash Anti Join performance degradation |
Date: | 2011-06-01 03:47:42 |
Message-ID: | 7913.1306900062@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> With respect to the root of the issue (why does the anti-join take so
> long?), my first thought was that perhaps the OP was very unlucky and
> had a lot of values that hashed to the same bucket. But that doesn't
> appear to be the case.
Well, yes it is. Notice what the subquery is doing: for each row in
"box", it's pulling all matching "box_id"s from message and running a
self-join across those rows. The hash join condition is a complete
no-op. And some of the box_ids have hundreds of thousands of rows.
I'd just write it off as being a particularly stupid way to find the
max(), except I'm not sure why deleting just a few thousand rows
improves things so much. It looks like it ought to be an O(N^2)
situation, so the improvement should be noticeable but not amazing.
And if you force it to not use a hashjoin, suddenly things are better.
Nestloop should also be O(N^2) in this situation, but seemingly it
avoids whatever weird corner case we are hitting here.
As Cedric says, the lack of any CHECK_FOR_INTERRUPTS in this loop is
also problematic. I'm not sure that right there is an ideal place
to put it, but we need one somewhere in the loop.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Craig Ringer | 2011-06-01 03:50:38 | Re: BUG #6046: select current_date crashes postgres |
Previous Message | Tom Lane | 2011-06-01 03:34:32 | Re: Any idea for serializing INSERTING SERIAL column? |
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Smith | 2011-06-01 06:14:11 | Re: Delete performance |
Previous Message | Robert Haas | 2011-06-01 02:58:16 | Re: [PERFORM] Hash Anti Join performance degradation |