From: yamt(at)mwd(dot)biglobe(dot)ne(dot)jp (YAMAMOTO Takashi)
To: Kevin(dot)Grittner(at)wicourts(dot)gov
Cc: drkp(at)csail(dot)mit(dot)edu, heikki(dot)linnakangas(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: SSI bug?
Date: 2011-03-28 03:16:16
Message-ID: 20110328031616.CDAA719CEDF@mail.netbsd.org
Lists: pgsql-hackers
hi,
>>> (6) Does the application continue to run relatively sanely, or
>>> does it fall over at this point?
>>
>> my application just exits on the error.
>>
>> if i re-run the application without restarting postgres, it seems
>> that i get the error sooner than on the first run. (but it might
>> just be a matter of luck)
>
> If your application hits this again, could you check pg_stat_activity
> and pg_locks and see if any SIReadLock entries are lingering after
> the owning transaction and all overlapping transactions have
> completed? If anything is lingering between runs of your
> application, it *should* show up in one or the other of these.
this is commit 71ac48fd9cebd3d2a873635a04df64096c981f73 with your two patches applied.
this psql session was the only activity on the server at this point.
hoge=# select * from pg_stat_activity;
-[ RECORD 1 ]----+--------------------------------
datid | 16384
datname | hoge
procpid | 7336
usesysid | 10
usename | takashi
application_name | psql
client_addr |
client_hostname |
client_port | -1
backend_start | 2011-03-26 12:28:21.882226+09
xact_start | 2011-03-28 11:55:19.300027+09
query_start | 2011-03-28 11:55:19.300027+09
waiting | f
current_query | select * from pg_stat_activity;
hoge=# select count(*) from pg_locks where mode='SIReadLock';
-[ RECORD 1 ]
count | 7057
hoge=# select locktype,count(*) from pg_locks group by locktype;
-[ RECORD 1 ]--------
locktype | virtualxid
count | 1
-[ RECORD 2 ]--------
locktype | relation
count | 1
-[ RECORD 3 ]--------
locktype | tuple
count | 7061
hoge=#
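fwiw, a sketch of a query to single out entries left over from
completed transactions (it assumes an SIReadLock whose
virtualtransaction has no matching virtualxid lock in pg_locks belongs
to a transaction that is no longer active):

select locktype, count(*)
from pg_locks
where mode = 'SIReadLock'
  and virtualtransaction not in
      (select virtualxid from pg_locks where locktype = 'virtualxid')
group by locktype;

note that SIReadLocks from committed transactions can legitimately
stick around while overlapping transactions are still running, so a
nonzero count by itself is not proof of a leak.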
>
>>> (7) The message hint would help pin it down, or a stack trace at
>>> the point of the error would help more. Is it possible to get
>>> either? Looking over the code, it appears that in the only places
>>> where SSI could generate that error, it would cancel that
>>> transaction with the hint "You might need to increase
>>> max_pred_locks_per_transaction." and otherwise allow normal
>>> processing.
>>
>> no message hints. these errors are not generated by SSI code,
>> at least not directly.
>
> Right, that's because we were using HASH_ENTER instead of
> HASH_ENTER_NULL. I've posted a patch which should correct that.
sure, with your patch applied the errors seem to have turned into different ones:
PG_DIAG_SEVERITY: WARNING
PG_DIAG_SQLSTATE: 53200
PG_DIAG_MESSAGE_PRIMARY: out of shared memory
PG_DIAG_SOURCE_FILE: shmem.c
PG_DIAG_SOURCE_LINE: 190
PG_DIAG_SOURCE_FUNCTION: ShmemAlloc

PG_DIAG_SEVERITY: ERROR
PG_DIAG_SQLSTATE: 53200
PG_DIAG_MESSAGE_PRIMARY: out of shared memory
PG_DIAG_MESSAGE_HINT: You might need to increase max_pred_locks_per_transaction.
PG_DIAG_SOURCE_FILE: predicate.c
PG_DIAG_SOURCE_LINE: 2049
PG_DIAG_SOURCE_FUNCTION: CreatePredicateLock
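
a quick way to look at the setting the hint refers to (a sketch; the
context column of pg_settings shows whether changing it needs a server
restart):

select name, setting, context from pg_settings
where name = 'max_pred_locks_per_transaction';

context should come back as 'postmaster', i.e. raising it in
postgresql.conf only takes effect after a restart, since the predicate
lock table lives in shared memory sized at server start.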
>>> Even with the above information it may be far from clear where
>>> allocations are going past their maximum, since one HTAB could
>>> grab more than its share and starve another which is staying below
>>> its "maximum". I'll take a look at the possibility of adding a
>>> warning or some such when an HTAB expands past its maximum size.
>
> I see from your later post that you are running with this patch. Has
> that reported anything yet?
i got nothing except the following (in the server log):
WARNING: hash table "ShmemIndex" has more entries than expected
DETAIL: The maximum was set to 32 on creation.
YAMAMOTO Takashi
>
> Thanks,
>
> -Kevin