From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Satoshi Nagayasu <snaga(at)uptime(dot)jp>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] pgstattuple2: block sampling to reduce physical read
Date: 2013-10-10 22:32:13
Message-ID: 52572AED.3030609@catalyst.net.nz
Lists: pgsql-hackers
On 11/10/13 11:09, Mark Kirkwood wrote:
> On 16/09/13 16:20, Satoshi Nagayasu wrote:
>> (2013/09/15 11:07), Peter Eisentraut wrote:
>>> On Sat, 2013-09-14 at 16:18 +0900, Satoshi Nagayasu wrote:
>>>> I'm looking forward to seeing more feedback on this approach,
>>>> in terms of design and performance improvement.
>>>> So, I have submitted this for the next CF.
>>> Your patch fails to build:
>>>
>>> pgstattuple.c: In function ‘pgstat_heap_sample’:
>>> pgstattuple.c:737:13: error: ‘SnapshotNow’ undeclared (first use in
>>> this function)
>>> pgstattuple.c:737:13: note: each undeclared identifier is reported
>>> only once for each function it appears in
>> Thanks for checking. Fixed to eliminate SnapshotNow.
>>
> This seems like a cool idea! I took a quick look, and initially
> replicated the sort of improvement you saw:
>
>
> bench=# explain analyze select * from pgstattuple('pgbench_accounts');
> QUERY PLAN
>
> --------------------------------------------------------------------------------
> Function Scan on pgstattuple (cost=0.00..0.01 rows=1 width=72) (actual
> time=786.368..786.369 rows=1 loops=1)
> Total runtime: 786.384 ms
> (2 rows)
>
> bench=# explain analyze select * from pgstattuple2('pgbench_accounts');
> NOTICE: pgstattuple2: SE tuple_count 0.00, tuple_len 0.00,
> dead_tuple_count 0.00, dead_tuple_len 0.00, free_space 0.00
> QUERY PLAN
>
> --------------------------------------------------------------------------------
> Function Scan on pgstattuple2 (cost=0.00..0.01 rows=1 width=72) (actual
> time=12.004..12.005 rows=1 loops=1)
> Total runtime: 12.019 ms
> (2 rows)
>
>
>
> I wondered what sort of difference eliminating caching would make:
>
> $ sudo sysctl -w vm.drop_caches=3
>
> Repeating the above queries:
>
>
> bench=# explain analyze select * from pgstattuple('pgbench_accounts');
> QUERY PLAN
>
> --------------------------------------------------------------------------------
> Function Scan on pgstattuple (cost=0.00..0.01 rows=1 width=72) (actual
> time=9503.774..9503.776 rows=1 loops=1)
> Total runtime: 9504.523 ms
> (2 rows)
>
> bench=# explain analyze select * from pgstattuple2('pgbench_accounts');
> NOTICE: pgstattuple2: SE tuple_count 0.00, tuple_len 0.00,
> dead_tuple_count 0.00, dead_tuple_len 0.00, free_space 0.00
> QUERY PLAN
>
> --------------------------------------------------------------------------------
> Function Scan on pgstattuple2 (cost=0.00..0.01 rows=1 width=72) (actual
> time=12330.630..12330.631 rows=1 loops=1)
> Total runtime: 12331.353 ms
> (2 rows)
>
>
> So the sampling code seems *slower* when the cache is completely cold -
> is that expected? (I have not looked at how the code works yet - I'll
> dive in later if I get a chance!)
>
Quietly replying to myself - looking at the code, the sampler does 3000
random page reads... I guess this is slower than the 163935 sequential
page reads needed to scan all of pgbench_accounts, thanks to OS readahead
on my type of disk (WD Velociraptor). Tweaking the number of random reads
(i.e. the sample size) down helps - but obviously that can impact
estimation accuracy.
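
For reference, the page count I'm quoting comes straight from the
catalog (relpages is the estimate maintained by VACUUM/ANALYZE, but it
is close enough here) - something like:

bench=# SELECT relpages FROM pg_class WHERE relname = 'pgbench_accounts';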
Thinking about this a bit more, I guess the elapsed runtime is not the
*only* thing to consider - the sampling code will cause way less
disruption to the OS page cache (3000 pages vs possibly lots more than
3000 for reading an entire relation).
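
One way to see that effect directly (a sketch, assuming the
pg_buffercache contrib module is installed - and noting that
vm.drop_caches only empties the kernel cache, not shared_buffers) is to
count how many of the relation's blocks sit in shared_buffers before and
after each call:

bench=# SELECT count(*)
          FROM pg_buffercache
          WHERE relfilenode = pg_relation_filenode('pgbench_accounts');

That only covers PostgreSQL's own buffer cache, of course - inspecting
the kernel page cache side would need something like the pgfincore
extension.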
Thoughts?
Cheers
Mark