From: Wes <wespvp(at)syntegra(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Jonel Rienton <jonel(at)rientongroup(dot)com>, 'Postgresql-General' <pgsql-general(at)postgresql(dot)org>
Subject: Re: Finding orphan records
Date: 2006-01-12 17:15:13
Message-ID: BFEBEAC1.1C5BC%wespvp@syntegra.com
Lists: pgsql-general
On 1/12/06 10:37 AM, "Greg Stark" <gsstark(at)mit(dot)edu> wrote:
> If it's only a factor of 3-4 then the merge join should be faster. If it's
> really two orders of magnitude (100x?) then the nested loop below would be
> faster. I think in 8.1 (and I think in 8.0 too) the planner is capable of
> coming up with both plans for the NOT IN query though.
Yep, it's really two orders of magnitude. A is about 4 million records. B
is around 200 million, and C is about 800 million records. That's why I
figured a sequential pass on A with index lookups on B and C would be
fastest. It took about 3.25 hours to complete the SELECT.
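For illustration, here is a sketch of the query shape under discussion, using sqlite3 in place of PostgreSQL and made-up table and column names (a.id, b.a_id, c.a_id are assumptions, not from the original thread). With indexes on b.a_id and c.a_id, a NOT EXISTS anti-join lets the planner scan A once and probe the two indexes, which is the nested-loop plan described above:

```python
# Hypothetical demonstration of the orphan-record query shape.
# Table and column names (a.id, b.a_id, c.a_id) are invented for the sketch.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (a_id INTEGER);
    CREATE TABLE c (a_id INTEGER);
    CREATE INDEX b_a_id ON b(a_id);
    CREATE INDEX c_a_id ON c(a_id);
""")
cur.executemany("INSERT INTO a VALUES (?)", [(i,) for i in range(1, 6)])
cur.executemany("INSERT INTO b VALUES (?)", [(1,), (2,), (2,)])
cur.executemany("INSERT INTO c VALUES (?)", [(2,), (3,)])

# NOT EXISTS form: rows of a referenced by neither b nor c are orphans.
orphans = cur.execute("""
    SELECT a.id FROM a
    WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id)
      AND NOT EXISTS (SELECT 1 FROM c WHERE c.a_id = a.id)
    ORDER BY a.id
""").fetchall()
print(orphans)  # -> [(4,), (5,)]
```

The same shape can be written with NOT IN, but NOT EXISTS sidesteps NULL-handling surprises and, on the table sizes quoted above, is the form most likely to get the index-probe plan.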
I'm working on the 8.1 upgrade now. My biggest concern is making sure no bugs have crept into ecpg that would affect us (we've been bitten once before, so I'm being careful). So far everything else is looking good with 8.1.
I'm pondering dumping the keys for A to a file, and B and C to another file,
sorting with uniqueness on the B/C file, then programmatically determining
which keys have been freed. I'm pretty sure this will be much faster, but I
had hoped to avoid an external process like this.
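The external process described above amounts to a sorted-merge anti-join done by hand: dump A's keys sorted, dump B's and C's keys sorted and deduplicated (e.g. with `sort -u`), then walk both streams once. A minimal sketch of that merge step, with invented helper and variable names:

```python
# Hypothetical sketch of the external approach: walk two sorted key
# streams and yield keys present in A but absent from the B/C stream.
def orphan_keys(a_keys_sorted, bc_keys_sorted):
    """Both inputs must be sorted ascending; bc may contain duplicates."""
    bc = iter(bc_keys_sorted)
    cur = next(bc, None)
    for key in a_keys_sorted:
        # Advance the B/C stream until it catches up with the current A key.
        while cur is not None and cur < key:
            cur = next(bc, None)
        if cur != key:       # key never appears in B or C: it's orphaned
            yield key

a = [1, 2, 3, 4, 5]           # stand-in for the dumped keys of A
bc = [1, 2, 2, 3]             # stand-in for the sorted B/C key dump
print(list(orphan_keys(a, bc)))  # -> [4, 5]
```

Each stream is read exactly once, so the cost is dominated by the external sorts; for files too large for memory, the same loop works line-by-line over the sorted dump files.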
Wes