Re: Problems with an update-from statement and pg-8.1.4

From: Erik Jones <erik(at)myemma(dot)com>
To: Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>
Cc: Ted Allen <tallen(at)blackducksoftware(dot)com>, Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Problems with an update-from statement and pg-8.1.4
Date: 2006-12-06 20:31:06
Message-ID: 4577288A.8080800@myemma.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Rafael Martinez wrote:
> On Wed, 2006-12-06 at 14:19 -0600, Erik Jones wrote:
>
>> Rafael Martinez wrote:
>>
>>> On Wed, 2006-12-06 at 14:55 -0500, Ted Allen wrote:
>>>
>>>
>>>> Stephan Szabo wrote:
>>>>
>>>>
>>>>> On Wed, 6 Dec 2006, Rafael Martinez wrote:
>>>>>
>>>>>
>>>>>> mailstats=# EXPLAIN update mail SET spamscore = '-5.026' FROM mail m,
>>>>>> mail_received mr where mr.mail_id = m.mail_id AND mr.queue_id =
>>>>>> '1GrxLs-0004N9-I1' and mr.mailhost = '129.240.10.47';
>>>>>>
>>>>>>
>>>>>>
>>>>> I don't think this statement does what you expect. You're ending up with
>>>>> two copies of mail in the above one as "mail" and one as "m". You probably
>>>>> want to remove the mail m in FROM and use mail rather than m in the
>>>>> where clause.
>>>>>
>>>>>
>>>>>
>>>>>
>>>> Worse yet I think your setting "spamcore" for EVERY row in mail to
>>>> '-5.026'. The above solution should fix it though.
>>>>
>>>> -- Ted
>>>>
>>>>
>>>>
>>> Thanks for the answers. I think the 'problem' is explain in the
>>> documentation:
>>>
>>> "fromlist
>>>
>>> A list of table expressions, allowing columns from other tables to
>>> appear in the WHERE condition and the update expressions. This is
>>> similar to the list of tables that can be specified in the FROMClause of
>>> a SELECT statement. Note that the target table must not appear in the
>>> fromlist, unless you intend a self-join (in which case it must appear
>>> with an alias in the fromlist)".
>>>
>>> And as you said, we can not have 'mail m' in the FROM clause. I have
>>> contacted the developers and they will change the statement. I gave then
>>> these 2 examples:
>>>
>>> -------------------------------------------------------------------------------
>>> mailstats=# EXPLAIN update mail SET spamscore = '-5.026' FROM
>>> mail_received mr where mr.mail_id = mail.mail_id AND mr.queue_id =
>>> '1GrxLs-0004N9-I1' and mr.mailhost = '129.240.10.47';
>>> QUERY PLAN
>>> ------------------------------------------------------------------------------------------------------
>>> Nested Loop (cost=0.00..6.54 rows=1 width=57)
>>> -> Index Scan using received_queue_id_index on mail_received mr
>>> (cost=0.00..3.20 rows=1 width=4)
>>> Index Cond: ((queue_id)::text = '1GrxLs-0004N9-I1'::text)
>>> Filter: (mailhost = '129.240.10.47'::inet)
>>> -> Index Scan using mail_pkey on mail (cost=0.00..3.32 rows=1
>>> width=57)
>>> Index Cond: ("outer".mail_id = mail.mail_id)
>>> (6 rows)
>>>
>>> mailstats=# explain update mail SET spamscore = '-5.026' where mail_id
>>> = (select mail_id from mail_received where queue_id = '1GrxLs-0004N9-I1'
>>> and mailhost = '129.240.10.47');
>>> QUERY PLAN
>>> -----------------------------------------------------------------------------------------------------
>>> Index Scan using mail_pkey on mail (cost=3.20..6.52 rows=1 width=57)
>>> Index Cond: (mail_id = $0)
>>> InitPlan
>>> -> Index Scan using received_queue_id_index on mail_received
>>> (cost=0.00..3.20 rows=1 width=4)
>>> Index Cond: ((queue_id)::text = '1GrxLs-0004N9-I1'::text)
>>> Filter: (mailhost = '129.240.10.47'::inet)
>>> (6 rows)
>>> -------------------------------------------------------------------------------
>>>
>>>
>> Look again at the estimated costs of those two query plans. You haven't
>> gained anything there. Try this out:
>>
>> EXPLAIN UPDATE mail
>> SET spamscore = '-5.026'
>> FROM mail_received mr
>> WHERE mail.mail_id = mr.mail_id AND mr.queue_id = '1GrxLs-0004N9-I1' ;
>>
>>
>
> Haven't we?
>
> * In the statement with problems we got this:
> Nested Loop (cost=0.00..932360.78 rows=7184312 width=57)
>
> * In the ones I sent:
> Nested Loop (cost=0.00..6.54 rows=1 width=57)
> Index Scan using mail_pkey on mail (cost=3.20..6.52 rows=1 width=57)
>
> * And in the last one you sent me:
> ------------------------------------------------------
> Nested Loop (cost=0.00..6.53 rows=1 width=57)
> -> Index Scan using received_queue_id_index on mail_received mr
> (cost=0.00..3.20 rows=1 width=4)
> Index Cond: ((queue_id)::text = '1GrxLs-0004N9-I1'::text)
> -> Index Scan using mail_pkey on mail (cost=0.00..3.32 rows=1
> width=57)
> Index Cond: (mail.mail_id = "outer".mail_id)
> (5 rows)
> ------------------------------------------------------
>
> I can not see the different.
>
> regards,
>
Ah, sorry, I was just looking at the two that you sent in your last
message thinking that they were 'old' and 'new', not both 'new'. My bad...

--
erik jones <erik(at)myemma(dot)com>
software development
emma(r)

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Ron 2006-12-06 20:52:55 Re: File Systems Compared
Previous Message Bruno Wolff III 2006-12-06 20:31:01 Re: File Systems Compared