From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Oh, this is embarrassing: init file logic is still broken |
Date: | 2015-06-24 21:52:48 |
Message-ID: | 558B26B0.1070704@agliodbs.com |
Lists: | pgsql-hackers |
On 06/23/2015 04:44 PM, Tom Lane wrote:
> Chasing a problem identified by my Salesforce colleagues led me to the
> conclusion that my commit f3b5565dd ("Use a safer method for determining
> whether relcache init file is stale") is rather borked. It causes
> pg_trigger_tgrelid_tgname_index to be omitted from the relcache init file,
> because that index is not used by any syscache. I had been aware of that
> actually, but considered it a minor issue. It's not so minor though,
> because RelationCacheInitializePhase3 marks that index as nailed for
> performance reasons, and includes it in NUM_CRITICAL_LOCAL_INDEXES.
> That means that load_relcache_init_file *always* decides that the init
> file is busted and silently(!) ignores it. So we're taking a nontrivial
> hit in backend startup speed as of the last set of minor releases.
OK, this is pretty bad in its real performance effects. On a workload
which is dominated by new connection creation, we've lost about 17%
throughput.
To test it, I ran pgbench -s 100 -j 2 -c 6 -r -C -S -T 1200 against a
database which fits in shared_buffers on two different m3.large
instances on AWS (across the network, not on unix sockets). A typical
run on 9.3.6 looks like this:
scaling factor: 100
query mode: simple
number of clients: 6
number of threads: 2
duration: 1200 s
number of transactions actually processed: 252322
tps = 210.267219 (including connections establishing)
tps = 31958.233736 (excluding connections establishing)
statement latencies in milliseconds:
0.002515 \set naccounts 100000 * :scale
0.000963 \setrandom aid 1 :naccounts
19.042859 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
Whereas a typical run on 9.3.9 looks like this:
scaling factor: 100
query mode: simple
number of clients: 6
number of threads: 2
duration: 1200 s
number of transactions actually processed: 208180
tps = 173.482259 (including connections establishing)
tps = 31092.866153 (excluding connections establishing)
statement latencies in milliseconds:
0.002518 \set naccounts 100000 * :scale
0.000988 \setrandom aid 1 :naccounts
23.076961 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
The numbers are pretty consistent across four runs each on two different
instances (+/- 4%), so I don't think this is Amazon variability we're
seeing. I think the ignored relcache init file is really costing us 17%. :-(
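For reference, the 17% figure follows from the "including connections establishing" tps values in the two runs above. Here is a minimal sketch of that arithmetic in Python. The helper name `relative_change` is made up for illustration, and the inputs are simply the numbers copied from the pgbench output quoted in this message:

```python
def relative_change(before: float, after: float) -> float:
    """Return the fractional change from `before` to `after` (negative = drop)."""
    return (after - before) / before


# Values copied from the two pgbench runs quoted in this message.
tps_incl_936 = 210.267219    # 9.3.6, including connection establishment
tps_incl_939 = 173.482259    # 9.3.9, including connection establishment
tps_excl_936 = 31958.233736  # 9.3.6, excluding connection establishment
tps_excl_939 = 31092.866153  # 9.3.9, excluding connection establishment

incl_change = relative_change(tps_incl_936, tps_incl_939)
excl_change = relative_change(tps_excl_936, tps_excl_939)

print(f"tps change, including connections: {incl_change:+.1%}")
print(f"tps change, excluding connections: {excl_change:+.1%}")
```

The change excluding connection establishment is small and falls inside the +/- 4% run-to-run spread reported above, which is consistent with the loss being concentrated in backend startup rather than in query execution.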
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com