Re: Optimal time series sampling.

From: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To: "Ted Byers" <r(dot)ted(dot)byers(at)rogers(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Gregory Stark" <stark(at)enterprisedb(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Optimal time series sampling.
Date: 2007-11-09 19:22:51
Message-ID: dcc563d10711091122o29956187ga5243db9394b5910@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Nov 9, 2007 11:47 AM, Ted Byers <r(dot)ted(dot)byers(at)rogers(dot)com> wrote:
>
> --- Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> > Ted Byers <r(dot)ted(dot)byers(at)rogers(dot)com> writes:
> > > OK, it is challenging to present it in plain text,
> > but
> > > here is the HTML exported by MySQL Query Browser.
> >
> > Why are you asking this list for help with a MySQL
> > performance problem?
> >
> because my question isn't really about MySQL, but
> rather about how best to construct the SQL required to
> get the job done, regardless of what database is used.
> I have seen some claims that it is better to use
> joins instead of correlated subqueries and others that
> say the opposite. And I do not, at this stage, know
> if there are other options in SQL that may or may not
> be better.

Which is better depends largely on how your database is built. MySQL
still uses loops for all subselects, so with large numbers of tuples
in the subselect method, it will be slow. But they might fix this in
a later release.

Fairly recent versions of PostgreSQL could make some bad choices when
doing joins for certain datasets that would be much faster with a
correlated subquery (specifically the old left join where
righttable.field is null trick made some pgsql versions choose an
inefficient join method)

So, the "right" way is a question of which db, and even which version
of that DB you're on.

> At this time, the database in use is
> irrelevant (I want to stick as close to the ANSI
> standard as practicable so the rewriting required will
> be minimal should we decide to change the database
> later, for whatever reason).

If you want to stick with ANSI, MySQL tends to be more divergent from
the spec than pgsql and other rdbms.

Most people would consider the correlate subquery the better method.
But it's also likely to be the slowest on MySQL.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Nathan Wilhelmi 2007-11-09 20:00:41 Importance of CPU floating point performance...
Previous Message Scott Marlowe 2007-11-09 19:18:18 Re: Insert statements really slow