Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I was actually thinking it would be interesting to oprofile the
> read-only test; see if we can figure out where those slowdowns are
> coming from.
CPU: Intel Core/i7, speed 2262 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with
a unit mask of 0x00 (No unit mask) count 100000
samples   %        image name   symbol name
3124242   5.7137   postgres     s_lock
2555554   4.6737   postgres     AllocSetAlloc
2403412   4.3954   postgres     GetSnapshotData
1967132   3.5975   postgres     SearchCatCache
1872176   3.4239   postgres     base_yyparse
1327256   2.4273   postgres     hash_search_with_hash_value
1040131   1.9022   postgres     _bt_compare
1038976   1.9001   postgres     LWLockAcquire
817122    1.4944   postgres     MemoryContextAllocZeroAligned
738321    1.3503   postgres     core_yylex
622613    1.1386   postgres     MemoryContextAlloc
597054    1.0919   postgres     PinBuffer
556138    1.0171   postgres     ScanKeywordLookup
552318    1.0101   postgres     expression_tree_walker
494279    0.9039   postgres     LWLockRelease
488628    0.8936   postgres     hash_any
472906    0.8649   postgres     nocachegetattr
396482    0.7251   postgres     grouping_planner
382974    0.7004   postgres     LockAcquireExtended
375186    0.6861   postgres     AllocSetFree
375072    0.6859   postgres     ProcArrayLockRelease
373668    0.6834   postgres     new_list
365917    0.6692   postgres     fmgr_info_cxt_security
301398    0.5512   postgres     ProcArrayLockAcquire
300647    0.5498   postgres     LockReleaseAll
292073    0.5341   postgres     DirectFunctionCall1Coll
285745    0.5226   postgres     MemoryContextAllocZero
284684    0.5206   postgres     FunctionCall2Coll
282701    0.5170   postgres     SearchSysCache
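
One rough way to read the listing above is to add up the percentages for groups of related symbols. The Python sketch below does that for a handful of rows copied from the profile. The split into "locking" and "memory allocation" is a hand-picked assumption for illustration, not something oprofile reports, and the helper names are hypothetical.

```python
# Rows copied from the oprofile listing: samples, percent, image, symbol.
PROFILE = """\
3124242 5.7137 postgres s_lock
2555554 4.6737 postgres AllocSetAlloc
1038976 1.9001 postgres LWLockAcquire
817122 1.4944 postgres MemoryContextAllocZeroAligned
622613 1.1386 postgres MemoryContextAlloc
494279 0.9039 postgres LWLockRelease
375186 0.6861 postgres AllocSetFree
375072 0.6859 postgres ProcArrayLockRelease
301398 0.5512 postgres ProcArrayLockAcquire
285745 0.5226 postgres MemoryContextAllocZero
"""

# Hypothetical grouping, chosen by hand for this example only.
GROUPS = {
    "locking": {
        "s_lock", "LWLockAcquire", "LWLockRelease",
        "ProcArrayLockRelease", "ProcArrayLockAcquire",
    },
    "memory allocation": {
        "AllocSetAlloc", "AllocSetFree", "MemoryContextAlloc",
        "MemoryContextAllocZero", "MemoryContextAllocZeroAligned",
    },
}


def parse_profile(text):
    """Return {symbol: percent} from whitespace-separated profile rows."""
    percent_by_symbol = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank lines and the multi-word header row
        _samples, percent, _image, symbol = fields
        percent_by_symbol[symbol] = float(percent)
    return percent_by_symbol


def group_totals(percent_by_symbol, groups):
    """Sum the percentages of the symbols listed in each group."""
    return {
        name: sum(percent_by_symbol.get(sym, 0.0) for sym in symbols)
        for name, symbols in groups.items()
    }


totals = group_totals(parse_profile(PROFILE), GROUPS)
for name, pct in totals.items():
    print(f"{name}: {pct:.4f}% of samples")
```

Because the rows are split on whitespace, the same parser works whether or not the columns are aligned.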
max_connections = 100
max_pred_locks_per_transaction = 64
shared_buffers = 8GB
maintenance_work_mem = 1GB
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
wal_writer_delay = 20ms
seq_page_cost = 0.1
random_page_cost = 0.1
cpu_tuple_cost = 0.05
effective_cache_size = 40GB
default_transaction_isolation = '$iso'
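
The '$iso' placeholder in the last setting suggests the file is filled in once per isolation level before each run. A minimal Python sketch of that substitution follows, assuming `string.Template` is an acceptable stand-in for whatever tool actually did it. The list of levels is an assumption (the message does not say which were tested), and `render_conf_lines` is a hypothetical helper.

```python
from string import Template

# The templated line from the configuration above; $iso is its placeholder.
CONF_LINE = Template("default_transaction_isolation = '$iso'")

# Assumed levels for illustration; all three are valid PostgreSQL values,
# but the original message does not list which ones were run.
ISOLATION_LEVELS = ["read committed", "repeatable read", "serializable"]


def render_conf_lines(levels):
    """Return one rendered configuration line per isolation level."""
    return [CONF_LINE.substitute(iso=level) for level in levels]


for line in render_conf_lines(ISOLATION_LEVELS):
    print(line)
```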
pgbench -i -s 100
pgbench -S -M simple -T 300 -c 80 -j 80
transaction type: SELECT only
scaling factor: 100
query mode: simple
number of clients: 80
number of threads: 80
duration: 300 s
number of transactions actually processed: 104391011
tps = 347964.636256 (including connections establishing)
tps = 347976.389034 (excluding connections establishing)
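
As a sanity check on the figures above, the reported tps can be compared against transactions divided by the nominal duration, and broken down per client. The Python sketch below uses only numbers from the pgbench output. The latency estimate assumes each client issues its next query as soon as the previous one returns, with no think time.

```python
transactions = 104391011        # "number of transactions actually processed"
duration_s = 300                # pgbench -T 300
clients = 80                    # pgbench -c 80
reported_tps = 347964.636256    # "including connections establishing"

# Throughput implied by the nominal 300 s run length.
nominal_tps = transactions / duration_s

# Share of the reported throughput handled by each client connection.
per_client_tps = reported_tps / clients

# Average time per transaction as seen by one client, under the
# back-to-back assumption stated above.
avg_latency_ms = 1000.0 / per_client_tps

print(f"nominal tps:    {nominal_tps:.2f}")
print(f"per-client tps: {per_client_tps:.2f}")
print(f"avg latency:    {avg_latency_ms:.4f} ms")
```

The nominal figure comes out slightly above the reported one, which is what you would expect if the measured elapsed time was a few milliseconds over 300 s.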
vmstat 1 showed differently this time -- no clue why.
procs --------------memory------------- ---swap-- -----io---- ---system---- -----cpu------
 r  b   swpd    free      buff    cache   si   so    bi    bo    in      cs us sy id wa st
91  0   8196 4189436 203925700 52314492    0    0     0     0 32255 1522807 85 13  1  0  0
92  0   8196 4189404 203925700 52314492    0    0     0     0 32796 1525463 85 14  1  0  0
67  0   8196 4189404 203925700 52314488    0    0     0     0 32343 1527988 85 13  1  0  0
93  0   8196 4189404 203925700 52314488    0    0     0     0 32701 1535827 85 13  1  0  0
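
To put the context-switch rate in perspective, it can be divided by the pgbench throughput. The Python sketch below does that with the four vmstat samples and the reported tps. It assumes the samples were taken while that pgbench run was in progress, which the message does not state explicitly.

```python
# (interrupts/s, context switches/s) from the "in" and "cs" columns
# of the four vmstat samples above.
VMSTAT_SAMPLES = [
    (32255, 1522807),
    (32796, 1525463),
    (32343, 1527988),
    (32701, 1535827),
]

# tps "including connections establishing" from the pgbench output.
TPS = 347964.636256

# Mean context switches per second across the samples.
mean_cs = sum(cs for _intr, cs in VMSTAT_SAMPLES) / len(VMSTAT_SAMPLES)

# Context switches per transaction, assuming both figures describe
# the same steady-state interval.
cs_per_txn = mean_cs / TPS

print(f"mean context switches/s: {mean_cs:.2f}")
print(f"context switches per transaction: {cs_per_txn:.2f}")
```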
-Kevin