Re: Slow sub-selects, max and count(*)

From: Bruno Wolff III <bruno(at)wolff(dot)to>
To: Richard Sydney-Smith <richard(at)ibisaustralia(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Slow sub-selects, max and count(*)
Date: 2004-02-06 03:39:41
Message-ID: 20040206033941.GA16374@wolff.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Thu, Feb 05, 2004 at 15:53:08 +0800,
Richard Sydney-Smith <richard(at)ibisaustralia(dot)com> wrote:
> Trial Solutions:
>
> 1) delete from fsechist where hist_q in (select hist_q from fsechist, temp_shareprices where hist_tick = ticker and dte = hist_date);

Don't you want:
delete from fsechist where hist_q in (select hist_q from temp_shareprices
where hist_tick = ticker and dte = hist_date);

You should be able to refer to the table being deleted instead of joining
to it again in the subselect. As long as you are using 7.4.x, this should
be pretty fast.

>
> far too slow

Since this is slower than the exists solutions, my guess is that you are
using a version prior to 7.4.

>
> 2)
>
> update fsechist set hist_tick = \'@@\' from fsechist h, temp_shareprices t where t.ticker = h.hist_tick and h.hist_date = t.dte;
> delete from fsechist where hist_tick = \'@@\';
>
> really , really slow. Thought this would use the indexes but "explain" ( being my friend) still gives the query plan as a sequential scan

Again don't add fsechist to the from list. This joins with fsechist twice.

>
> 3)
>
> -- does not allow insert of a single company data
> delete from fsechist where hist_date in (select distinct dte from temp_shareprices);
>
> Works quick enough but has limited functionality. ( does not cope with imports other than a full end of trading)

I think using the distinct clause is going to slow this down, especially
if you are using a version prior to 7.4 since a sort is used instead of
a hash table to get unique values.

>
> 4)
>
> delete from fsechist where exists(select 1 from fsechist, temp_shareprices where hist_tick = ticker and dte = hist_date);
>
> Tried and still took over 60 secs before I cancelled the request.
> Indexes

Again, drop fsechist from the exists subselect.

>
> Both tables have indexes defined for tick and date. tick and date of the same data type in both cases.

You can also do a join using the nonstandard implied from syntax.
You can do something like:
delete from fsechist where hist_tick = temp_shareprices.ticker and
temp_shareprices.dte = hist_date;

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Josh Berkus 2004-02-06 03:59:43 Re: Slow sub-selects, max and count(*)
Previous Message Chris Anderson 2004-02-05 23:04:28 Could not convert UTF-8 to ISO8859-1