From: | Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org> |
---|---|
To: | Matthew Schumacher <matt(dot)s(at)aptalaska(dot)net> |
Cc: | josh(at)agliodbs(dot)com, <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Performance problems testing with Spamassassin 3.1.0 |
Date: | 2005-07-29 06:48:48 |
Message-ID: | Pine.LNX.4.44.0507290835100.7470-100000@zigo.dhs.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Wed, 27 Jul 2005, Matthew Schumacher wrote:
> Then they do this to insert the token:
>
> INSERT INTO bayes_token (
> id,
> token,
> spam_count,
> ham_count,
> atime
> ) VALUES (
> ?,
> ?,
> ?,
> ?,
> ?
> ) ON DUPLICATE KEY
> UPDATE
> spam_count = GREATEST(spam_count + ?, 0),
> ham_count = GREATEST(ham_count + ?, 0),
> atime = GREATEST(atime, ?)
>
> Or update the token:
>
> UPDATE bayes_vars SET
> $token_count_update
> newest_token_age = GREATEST(newest_token_age, ?),
> oldest_token_age = LEAST(oldest_token_age, ?)
> WHERE id = ?
>
>
> I think the reason why the procedure was written for postgres was
> because of the greatest and least statements performing poorly.
How can they perform poorly when they are dead simple? Here are 2
functions that work for the above cases of greatest:
CREATE FUNCTION greatest_int (integer, integer)
RETURNS INTEGER
IMMUTABLE STRICT
AS 'SELECT CASE WHEN $1 < $2 THEN $2 ELSE $1 END;'
LANGUAGE SQL;
CREATE FUNCTION least_int (integer, integer)
RETURNS INTEGER
IMMUTABLE STRICT
AS 'SELECT CASE WHEN $1 < $2 THEN $1 ELSE $2 END;'
LANGUAGE SQL;
and these should be inlined by pg and very fast to execute.
I wrote a function that should do what the insert above does. The update
I've not looked at (I don't know what $token_count_update is) but the
update looks simple enough to just implement the same way in pg as in
mysql.
For the insert or replace case you can probably use this function:
CREATE FUNCTION insert_or_update_token (xid INTEGER,
xtoken BYTEA,
xspam_count INTEGER,
xham_count INTEGER,
xatime INTEGER)
RETURNS VOID AS
$$
BEGIN
LOOP
UPDATE bayes_token
SET spam_count = greatest_int (spam_count + xspam_count, 0),
ham_count = greatest_int (ham_count + xham_count, 0),
atime = greatest_int (atime, xatime)
WHERE id = xid
AND token = xtoken;
IF found THEN
RETURN;
END IF;
BEGIN
INSERT INTO bayes_token VALUES (xid,
xtoken,
xspam_count,
xham_count,
xatime);
RETURN;
EXCEPTION WHEN unique_violation THEN
-- do nothing
END;
END LOOP;
END;
$$
LANGUAGE plpgsql;
It's not really tested so I can't tell if it's faster then what you have.
What it does do is mimic the way you insert values in mysql. It only work
on pg 8.0 and later however since the exception handling was added in 8.0.
--
/Dennis Björklund
From | Date | Subject | |
---|---|---|---|
Next Message | Luke Lonergan | 2005-07-29 07:01:07 | Re: Performance problems testing with Spamassassin |
Previous Message | Gavin Sherry | 2005-07-29 05:58:24 | Re: Performance problems testing with Spamassassin 3.1.0 |