Re: I feel the need for speed. What am I doing wrong?

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Nigel J(dot) Andrews" <nandrews(at)investsystems(dot)co(dot)uk>
Cc: <pgsql-hackers(at)postgresql(dot)org>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: I feel the need for speed. What am I doing wrong?
Date: 2003-01-07 01:10:41
Message-ID: D90A5A6C612A39408103E6ECDD77B829408A23@voyager.corporate.connx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> -----Original Message-----
> From: Nigel J. Andrews [mailto:nandrews(at)investsystems(dot)co(dot)uk]
> Sent: Monday, January 06, 2003 4:58 PM
> To: Dann Corbit
> Cc: pgsql-hackers(at)postgresql(dot)org; pgsql-general(at)postgresql(dot)org
> Subject: Re: [HACKERS] I feel the need for speed. What am I
> doing wrong?
>
> Added -general list so that the next followup can remove
> -hackers and everyone there will have had notice.
>
>
> On Mon, 6 Jan 2003, Dann Corbit wrote:
> >
> > I have a query using two postgres tables.
> > One is called "CNX_DS_53_SIS_STU_OPT_FEE_TB" and the other is called
> > "CNX_DS2_53_SIS_STU_OPT_FEE_TB".
> >
> > I am getting 3 times slower performance than Microsoft Access when
> > performing a left outer join.
> >
> > ...
> >
> > Here is the query:
> > select a."RT_REC_KEY", a."cnxarraycolumn", a."CRC" from
> > "CNX_DS_53_SIS_STU_OPT_FEE_TB" a left outer join
> > "CNX_DS2_53_SIS_STU_OPT_FEE_TB" b on ( a."RT_REC_KEY" =
> b."RT_REC_KEY"
> > and a."cnxarraycolumn" = b."cnxarraycolumn") where b.oid is null ;
> >
> >
> > Creating the following index had no effect on performance! create
> > unique index i1 on "CNX_DS2_53_SIS_STU_OPT_FEE_TB" ("RT_REC_KEY",
> > "cnxarraycolumn", "CRC");
> >
> > Both tables had 6139062 rows of data.
> >
> > In this query ... all rows of data match perfectly, so no
> results are
> > returned.
>
> I suspect you get no results because it's unlikely b.oid will
> be null. Are you sure the query is how it should be since you
> seem to be expecting no rows to be returned and yet your
> reason for that doesn't match the query as shown. Without the
> oid test I'd bet you get a result set of 6139062 rows.

There are supposed to be no results for this particular query.
The data is unique with only RT_REC_KEY and cnxarraycolumn (I tried
using that as an index also).

The basic gist of it is like this:

I want to know where the keys match (e.g.: RT_REC_KEY + cnxarraycolumn)
but the CRC has changed (which will tell me what data has changed).
This is for a data synchronization system that uses PostgreSQL as a join
engine. I store primary key data together with a 64 bit CRC in data
tables stored in PostgreSQL. In this particular case, there were no
changes but there will be changes at other times.

> > Is there a way to reformulate this query so that it will use the
> > index?
>
> Given the above comment I'd say no since the entirety of both
> tables will be tested to make the result set.
>
> Alternatively, if the query is right try something along the lines of:
>
> SELECT a.blah, a.foo,
> FROM a, b
> WHERE a.blah = b.blah AND a.foo = b.foo AND b.oid IS NULL

OID is never null. I don't think that this query is equivalent. This
Oid is just the standard PostgreSQL Oid, and not some arbitrary field.

> if that doesn't use a query try pushing the null test into a
> subselect like:
>
> SELECT a.blah, a.foo,
> FROM a, (SELECT * FROM b WHERE oid IS NULL) b
> WHERE a.blah = b.blah AND a.foo = b.foo
>

OID is never null. I don't think that this query is equivalent.

> After that let's hope I haven't embarrassed myself.

Probably, I did not explain myself clearly enough.

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2003-01-07 01:21:40 Re: I feel the need for speed. What am I doing wrong?
Previous Message Marc G. Fournier 2003-01-07 01:04:38 Re: New Portal in Place, DNS switched ...