Re: Problems with an update-from statement and pg-8.1.4

From: Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>
To: Erik Jones <erik(at)myemma(dot)com>
Cc: Ted Allen <tallen(at)blackducksoftware(dot)com>, Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Problems with an update-from statement and pg-8.1.4
Date: 2006-12-06 20:29:26
Message-ID: 1165436967.9606.32.camel@bbking.linux
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, 2006-12-06 at 14:19 -0600, Erik Jones wrote:
> Rafael Martinez wrote:
> > On Wed, 2006-12-06 at 14:55 -0500, Ted Allen wrote:
> >
> >> Stephan Szabo wrote:
> >>
> >>> On Wed, 6 Dec 2006, Rafael Martinez wrote:
> >>>
> >>>> mailstats=# EXPLAIN update mail SET spamscore = '-5.026' FROM mail m,
> >>>> mail_received mr where mr.mail_id = m.mail_id AND mr.queue_id =
> >>>> '1GrxLs-0004N9-I1' and mr.mailhost = '129.240.10.47';
> >>>>
> >>>>
> >>> I don't think this statement does what you expect. You're ending up with
> >>> two copies of mail in the above one as "mail" and one as "m". You probably
> >>> want to remove the mail m in FROM and use mail rather than m in the
> >>> where clause.
> >>>
> >>>
> >>>
> >> Worse yet I think your setting "spamcore" for EVERY row in mail to
> >> '-5.026'. The above solution should fix it though.
> >>
> >> -- Ted
> >>
> >>
> >
> > Thanks for the answers. I think the 'problem' is explain in the
> > documentation:
> >
> > "fromlist
> >
> > A list of table expressions, allowing columns from other tables to
> > appear in the WHERE condition and the update expressions. This is
> > similar to the list of tables that can be specified in the FROMClause of
> > a SELECT statement. Note that the target table must not appear in the
> > fromlist, unless you intend a self-join (in which case it must appear
> > with an alias in the fromlist)".
> >
> > And as you said, we can not have 'mail m' in the FROM clause. I have
> > contacted the developers and they will change the statement. I gave then
> > these 2 examples:
> >
> > -------------------------------------------------------------------------------
> > mailstats=# EXPLAIN update mail SET spamscore = '-5.026' FROM
> > mail_received mr where mr.mail_id = mail.mail_id AND mr.queue_id =
> > '1GrxLs-0004N9-I1' and mr.mailhost = '129.240.10.47';
> > QUERY PLAN
> > ------------------------------------------------------------------------------------------------------
> > Nested Loop (cost=0.00..6.54 rows=1 width=57)
> > -> Index Scan using received_queue_id_index on mail_received mr
> > (cost=0.00..3.20 rows=1 width=4)
> > Index Cond: ((queue_id)::text = '1GrxLs-0004N9-I1'::text)
> > Filter: (mailhost = '129.240.10.47'::inet)
> > -> Index Scan using mail_pkey on mail (cost=0.00..3.32 rows=1
> > width=57)
> > Index Cond: ("outer".mail_id = mail.mail_id)
> > (6 rows)
> >
> > mailstats=# explain update mail SET spamscore = '-5.026' where mail_id
> > = (select mail_id from mail_received where queue_id = '1GrxLs-0004N9-I1'
> > and mailhost = '129.240.10.47');
> > QUERY PLAN
> > -----------------------------------------------------------------------------------------------------
> > Index Scan using mail_pkey on mail (cost=3.20..6.52 rows=1 width=57)
> > Index Cond: (mail_id = $0)
> > InitPlan
> > -> Index Scan using received_queue_id_index on mail_received
> > (cost=0.00..3.20 rows=1 width=4)
> > Index Cond: ((queue_id)::text = '1GrxLs-0004N9-I1'::text)
> > Filter: (mailhost = '129.240.10.47'::inet)
> > (6 rows)
> > -------------------------------------------------------------------------------
> >
> Look again at the estimated costs of those two query plans. You haven't
> gained anything there. Try this out:
>
> EXPLAIN UPDATE mail
> SET spamscore = '-5.026'
> FROM mail_received mr
> WHERE mail.mail_id = mr.mail_id AND mr.queue_id = '1GrxLs-0004N9-I1' ;
>

Haven't we?

* In the statement with problems we got this:
Nested Loop (cost=0.00..932360.78 rows=7184312 width=57)

* In the ones I sent:
Nested Loop (cost=0.00..6.54 rows=1 width=57)
Index Scan using mail_pkey on mail (cost=3.20..6.52 rows=1 width=57)

* And in the last one you sent me:
------------------------------------------------------
Nested Loop (cost=0.00..6.53 rows=1 width=57)
-> Index Scan using received_queue_id_index on mail_received mr
(cost=0.00..3.20 rows=1 width=4)
Index Cond: ((queue_id)::text = '1GrxLs-0004N9-I1'::text)
-> Index Scan using mail_pkey on mail (cost=0.00..3.32 rows=1
width=57)
Index Cond: (mail.mail_id = "outer".mail_id)
(5 rows)
------------------------------------------------------

I can not see the different.

regards,
--
Rafael Martinez, <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>
Center for Information Technology Services
University of Oslo, Norway

PGP Public Key: http://folk.uio.no/rafael/

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Bruno Wolff III 2006-12-06 20:31:01 Re: File Systems Compared
Previous Message Brian Hurt 2006-12-06 20:24:09 Re: File Systems Compared