Re: Another weird one with an UPDATE

From: David Griffiths <dgriffiths(at)boats(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Another weird one with an UPDATE
Date: 2003-10-11 21:35:52
Message-ID: 05a401c3903f$a8524850$6501a8c0@griffiths2
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Sorry - just found the FAQ (
http://jamesthornton.com/postgres/FAQ/faq-english.html#4.22
<http://jamesthornton.com/postgres/FAQ/faq-english.html#4.22> ) on how
IN is very slow.

So I rewrote the query:

\o ./data/temp.txt
SELECT current_timestamp;
UPDATE user_account SET last_name = 'abc'
WHERE EXISTS (SELECT ua.user_account_id FROM user_account ua,
commercial_entity ce, commercial_service cs
WHERE ua.user_account_id = ce.user_account_id AND
ce.commercial_entity_id = cs.commercial_entity_id);
SELECT current_timestamp;
\o

EXISTS is kind of a weird statement, and it doesn't appear to be
identical (the number of rows updated was 72,000 rather than 3500). It
also took 4 minutes to execute.

Is there any way around this other than breaking the query into two? As
in:

pstmt1 = conn.preprareStatement("SELECT ua.user_account_id FROM
user_account ua, commercial_entity ce, commercial_service cs
WHERE ua.user_account_id = ce.user_account_id AND
ce.commercial_entity_id = cs.commercial_entity_id");
rset = pstmt1.executeQuery();
while (rset.next())
{
pstmt2 = conn.prepareStatement("UPDATE user_account SET last_name =
'abc' WHERE user_account_id = ?");
pstmt2.setLong(1, rset.getLong(1));
...
}

Unfort, that will be alot of data moved from Postgres->middle-tier
(Weblogic/Resin), which is inefficient.

Anyone see another solution?

David.

----- Original Message -----
From: David <mailto:dgriffiths(at)boats(dot)com> Griffiths
To: pgsql-performance(at)postgresql(dot)org
<mailto:pgsql-performance(at)postgresql(dot)org>
Sent: Saturday, October 11, 2003 12:44 PM
Subject: [PERFORM] Another weird one with an UPDATE

I am running an update-query to benchmark various databases; the
postgres version is,

UPDATE user_account SET last_name = 'abc'
WHERE user_account_id IN (SELECT user_account_id FROM commercial_entity,
commercial_service WHERE yw_account_id IS NULL
AND commercial_entity.commercial_entity_id =
commercial_service.commercial_entity_id);


The inner query (the select), run by itself, takes about a second. Add
the outer query (the update-portion), and the query dies. The machine
has been vacuum-analzyed. Here is the explain-analyze:

benchtest=# EXPLAIN ANALYZE UPDATE user_account SET last_name = 'abc'
benchtest-# WHERE user_account_id IN (SELECT user_account_id FROM
commercial_entity, commercial_service WHERE yw_account_id IS NULL
benchtest(# AND commercial_entity.commercial_entity_id =
commercial_service.commercial_entity_id);

Seq Scan on user_account (cost=0.00..813608944.88 rows=36242
width=718) (actual time=15696258.98..16311130.29 rows=3075 loops=1)
Filter: (subplan)
SubPlan
-> Materialize (cost=11224.77..11224.77 rows=86952 width=36)
(actual time=0.06..106.40 rows=84831 loops=72483)
-> Merge Join (cost=0.00..11224.77 rows=86952 width=36)
(actual time=0.21..1845.13 rows=85158 loops=1)
Merge Cond: ("outer".commercial_entity_id =
"inner".commercial_entity_id)
-> Index Scan using commercial_entity_pkey on
commercial_entity (cost=0.00..6787.27 rows=77862 width=24) (actual
time=0.06..469.56 rows=78132 loops=1)
Filter: (yw_account_id IS NULL)
-> Index Scan using comm_serv_comm_ent_id_i on
commercial_service (cost=0.00..2952.42 rows=88038 width=12) (actual
time=0.03..444.80 rows=88038 loops=1)
Total runtime: 16332976.21 msec
(10 rows)


Here are the relevant parts of the schema:



USER_ACCOUNT

Column | Type |
Modifiers
-------------------------------+-----------------------------+----------
-------------------
user_account_id | numeric(10,0) | not null
first_name | character varying(100) |
last_name | character varying(100) |
Indexes: user_account_pkey primary key btree (user_account_id),
usr_acc_last_name_i btree (last_name),
Foreign Key constraints: $1 FOREIGN KEY (lang_id) REFERENCES
lang(lang_id) ON UPDATE NO ACTION ON DELETE NO ACTION,
$2 FOREIGN KEY (source_id) REFERENCES
source(source_id) ON UPDATE NO ACTION ON DELETE NO ACTION,
$3 FOREIGN KEY (user_role_id) REFERENCES
user_role(user_role_id) ON UPDATE NO ACTION ON DELETE NO ACTION



COMMERCIAL_ENTITY

Column | Type |
Modifiers
---------------------------+-----------------------------+--------------
-----------------------------------------------
commercial_entity_id | numeric(10,0) | not null
yw_account_id | numeric(10,0) |
Indexes: commercial_entity_pkey primary key btree
(commercial_entity_id),
comm_ent_yw_acc_id_i btree (yw_account_id)
Foreign Key constraints: $1 FOREIGN KEY (source_id) REFERENCES
source(source_id) ON UPDATE NO ACTION ON DELETE NO ACTION,
$2 FOREIGN KEY (user_account_id) REFERENCES
user_account(user_account_id) ON UPDATE NO ACTION ON DELETE NO ACTION



COMMERCIAL_SERVICE

Column | Type | Modifiers
----------------------+---------------+-----------
commercial_entity_id | numeric(10,0) | not null
service_type_id | numeric(10,0) | not null
source_id | numeric(10,0) | not null
Indexes: commercial_service_pkey primary key btree
(commercial_entity_id, service_type_id),
comm_serv_comm_ent_id_i btree (commercial_entity_id),
comm_serv_serv_type_id_i btree (service_type_id),
comm_serv_source_id_i btree (source_id)
Foreign Key constraints: $1 FOREIGN KEY (commercial_entity_id)
REFERENCES commercial_entity(commercial_entity_id) ON UPDATE NO ACTION
ON DELETE NO ACTION,
$2 FOREIGN KEY (source_id) REFERENCES
source(source_id) ON UPDATE NO ACTION ON DELETE NO ACTION,
$3 FOREIGN KEY (service_type_id) REFERENCES
service_type(service_type_id) ON UPDATE NO ACTION ON DELETE NO ACTION



Here is the postgres.conf (or the variables that are not commented out):

tcpip_socket = true
max_connections = 500
shared_buffers = 32768 # min max_connections*2 or 16, 8KB each
wal_buffers = 128 # min 4, typically 8KB each
sort_mem = 4096 # min 64, size in KB
effective_cache_size = 50000 # typically 8KB each

Is it a problem with "IN"?

David

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Stephan Szabo 2003-10-11 22:34:52 Re: Another weird one with an UPDATE
Previous Message David Griffiths 2003-10-11 19:44:36 Another weird one with an UPDATE