From: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Cc: | Katsuhiko Okano <okano(dot)katsuhiko(at)oss(dot)ntt(dot)co(dot)jp> |
Subject: | Re: CSStorm occurred again by postgreSQL8.2 |
Date: | 2006-08-07 04:36:56 |
Message-ID: | 20060807114525.5266.ITAGAKI.TAKAHIRO@oss.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-patches |
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > It does not solve, even if it increases the number of NUM_SUBTRANS_BUFFERS.
> > The problem was only postponed.
>
> Can you provide a reproducible test case for this?
This is the reproducible test case:
- Occurs on both 8.1.4 and HEAD.
- On smp machine. I used dual opterons.
CSStrom becomes worse on dual xeon with hyper-threading.
- Tuning parameters are default. Whole data are cached in shared buffers.
(shared_buffers=32MB, data of pgbench (scale=1) are less than 15MB.)
- Using custom pgbench. One client doing UPDATE with indexscan
and multiple clients doing SELECT with seqscan/indexscan.
$ pgbench -i
$ pgbench -n -c 1 -t 100000 -f cs_update.sql &
$ pgbench -n -c 50 -t 100000 -f cs_indexscan.sql &
$ pgbench -n -c 35 -t 100000 -f cs_seqscan.sql &
(The scripts are attached at end of this message.)
In above workload, context switches are 2000-10000/sec and cpu usage is
user=100%. Then, start a long open transaction on another connection.
$ psql
# begin; -- Long open transaction
After a lapse of 30-60 seconds, context switches become 50000/sec over
(120000 over on xeons) and cpu usage is user=66% / sys=21% / idle=13%.
If we increase the frequency of UPDATE, the duration becomes shorter.
This is a human-induced workload, but I can see the same condition in
TPC-W -- even though it is a benchmark. TPC-W requires full-text search
and it is implementd using "LIKE %foo%" in my implementation (DBT-1, too).
Also, it requires periodical aggregations. They might behave as long
transactions.
The cause seems to be a lock contention. The number of locks on
SubtransControlLock and SubTransBuffer are significantly increased
by comparison with BufMappingLocks.
# Before starting a long transaction.
kind | lwlock | sh_call | sh_wait | ex_call | ex_wait
------+---------------------+----------+---------+---------+---------
13 | SubtransControlLock | 28716 | 2 | 54 | 0
22 | BufMappingLock | 11637884 | 0 | 2492 | 0
27 | SubTransBuffer | 0 | 0 | 11 | 0
# After
kind | lwlock | sh_call | sh_wait | ex_call | ex_wait
------+---------------------+----------+---------+---------+---------
13 | SubtransControlLock | 4139111 | 65059 | 3926691 | 390838
22 | BufMappingLock | 32348073 | 0 | 2509 | 0
27 | SubTransBuffer | 939646 | 960341 | 1419152 | 61
The invokers of SubTrans module are two SubTransGetTopmostTransaction()
in HeapTupleSatisfiesSnapshot(). When I disabled the calls, CSStorm did
not occur. SubTransGetTopmostTransaction returns the argument without
change when we don't use SAVEPOINTs.
If we optimize for non-subtransactions, we can avoid to lock SubTrans
for check visiblities of tuples inserted by top transactions.
If we want to resolve the probmen fundamentally, we might have to
improve SubTrans using a better buffer management algorithm or so.
Do you have any idea to avoid such a problem?
-- cs_update.sql
\set naccounts 100000 * :tps
\setrandom aid 1 :naccounts
\setrandom delta -5000 5000
UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT pg_sleep(0.1);
-- cs_seqscan.sql
\set naccounts 100000 * :tps
\setrandom aid 1 :naccounts
SELECT abalance FROM accounts WHERE aid::int8 = :aid; -- cast to force seqscan
-- cs_indexscan.sql
\set naccounts 100000 * :tps
\setrandom aid 1 :naccounts
SELECT abalance FROM accounts WHERE aid = :aid;
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Joshua D. Drake | 2006-08-07 07:33:56 | Re: 8.2 features status |
Previous Message | Tom Lane | 2006-08-07 03:53:43 | Re: 'startup waiting' status message |
From | Date | Subject | |
---|---|---|---|
Next Message | Hiroshi Saito | 2006-08-07 07:52:21 | Re: Fixed definition of complicated errcode of ms_vc8 |
Previous Message | Tom Lane | 2006-08-07 03:50:06 | Re: LWLock statistics collector |