Re: Finding orphan records

From: Wes <wespvp(at)syntegra(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Jonel Rienton <jonel(at)rientongroup(dot)com>, 'Postgresql-General' <pgsql-general(at)postgresql(dot)org>
Subject: Re: Finding orphan records
Date: 2006-01-12 17:15:13
Message-ID: BFEBEAC1.1C5BC%wespvp@syntegra.com
Lists: pgsql-general

On 1/12/06 10:37 AM, "Greg Stark" <gsstark(at)mit(dot)edu> wrote:

> If it's only a factor of 3-4 then the merge join should be faster. If it's
> really two orders of magnitude (100x?) then the nested loop below would be
> faster. I think in 8.1 (and I think in 8.0 too) the planner is capable of
> coming up with both plans for the NOT IN query though.

Yep, it's really two orders of magnitude. A is about 4 million records. B
is around 200 million, and C is about 800 million records. That's why I
figured a sequential pass on A with index lookups on B and C would be
fastest. It took about 3.25 hours to complete the SELECT.

I'm working on the 8.1 upgrade now. My biggest concern is making sure no
bugs have crept into ecpg that would affect us (bitten once before, so
being careful). So far everything else is looking good with 8.1.

I'm pondering dumping the keys for A to a file, and B and C to another file,
sorting with uniqueness on the B/C file, then programmatically determining
which keys have been freed. I'm pretty sure this will be much faster, but I
had hoped to avoid an external process like this.
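That external approach can be sketched with standard Unix tools; the file names and key values below are made up for illustration, and in practice the key files would come from COPY or psql dumps of the real tables:

```shell
# Hypothetical sketch of the external process: dump the keys in A to one
# file and every key referenced by B or C to another, sort each with
# duplicates removed, then use comm to find keys in A that no longer
# appear in B or C (the orphans, i.e. freed keys).

printf '%s\n' 101 102 103 104 > a_keys.txt    # keys present in A
printf '%s\n' 102 101 102 104 > bc_keys.txt   # keys referenced by B and C

sort -u a_keys.txt  > a_sorted.txt
sort -u bc_keys.txt > bc_sorted.txt

# comm -23 prints lines unique to the first file: keys in A with no
# remaining reference in B or C.
comm -23 a_sorted.txt bc_sorted.txt
```

On these sample files the last command prints 103, the one key with no reference. An external sort scales well here because sort(1) spills to temporary files rather than needing the 800 million keys in memory at once.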

Wes
