Re: posix_fadvise v22

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Greg Smith <gsmith(at)gregsmith(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, Postgres <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: posix_fadvise v22
Date:	2009-01-02 20:01:45
Message-ID:	7537.1230926505@sss.pgh.pa.us
Lists:	pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> On Fri, 2 Jan 2009, Tom Lane wrote:
>> ISTM that you *should* be able to see an improvement on even
>> single-spindle systems, due to better overlapping of CPU and I/O effort.

> The earlier synthetic tests I did:
> http://archives.postgresql.org/pgsql-hackers/2008-09/msg01401.php
> Showed a substantial speedup even in the single spindle case on a couple
> of systems, but one didn't really seem to benefit. So we could theorize
> that Robert's test system is more like that one. If someone can help out
> with making a more formal test case showing this in action, I'll dig into
> the details of what's different between that system and the others.

Well, I claim that if you start with a query that's about 50% CPU and
50% I/O effort, you ought to be able to get something approaching 2X
speedup if this patch really works. Consider something like

-- Burn CPU by spinning through $1 empty loop iterations per call.
create function waste_time(int) returns int as $$
begin
  for i in 1 .. $1 loop
    null;
  end loop;
  return 1;
end $$ language plpgsql;

select count(waste_time(42)) from very_large_table;

In principle you should be able to adjust the constant so that vmstat
shows about 50% CPU busy, and then enabling fadvise should improve
matters significantly.
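
The 2X figure follows from a simple overlap model. As a rough sketch only (it is not part of the patch, and the function name `overlap_speedup` is made up for illustration): if a query spends a fraction c of its serial runtime on CPU and 1 - c waiting on I/O, perfect overlap shrinks the runtime to max(c, 1 - c), so the best possible speedup is 1 / max(c, 1 - c). This assumes ideal overlap and no prefetch overhead, which a real system will only approach.

```python
def overlap_speedup(cpu_fraction: float) -> float:
    """Upper bound on speedup when CPU work and I/O waits overlap perfectly.

    cpu_fraction is the share of the serial runtime spent on CPU;
    the remainder is assumed to be I/O wait. (Hypothetical helper.)
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    io_fraction = 1.0 - cpu_fraction
    # Serial runtime is cpu + io = 1; fully overlapped runtime is max(cpu, io).
    return 1.0 / max(cpu_fraction, io_fraction)


if __name__ == "__main__":
    for share in (0.10, 0.25, 0.50, 0.75):
        print(f"{share:.0%} CPU: at most {overlap_speedup(share):.2f}x")
```

The bound peaks at an even 50/50 split and falls off on either side, which is why the loop constant is worth tuning before measuring.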

Now the above proposed test case is too simple because it will generate
a seqscan, and if the kernel is not completely brain-dead it will not
need any fadvise hinting to do read-ahead. But you should be able to
adapt the idea for whatever indexscan-based test case you are really
using.

Note: on a multi-CPU system you need to take vmstat or top numbers with
a grain of salt, since they might consider "one CPU 50% busy" as
"system only 50/N % busy".
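
To make the 50/N arithmetic concrete, here is a small sketch. It assumes the test backend is the only significant load, that it runs on one CPU at a time, and that the tool reports an average across all CPUs; `system_wide_target` is a hypothetical helper, not something vmstat or top provides.

```python
def system_wide_target(per_cpu_busy_pct: float, n_cpus: int) -> float:
    """System-wide busy % to look for when one CPU should be per_cpu_busy_pct busy.

    Assumes a single-process workload and a tool that averages over all CPUs.
    """
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    return per_cpu_busy_pct / n_cpus


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"{n} CPU(s): aim for about {system_wide_target(50.0, n):.2f}% busy")
```

On a 4-CPU machine, for example, a backend that keeps its CPU half busy would show up as roughly 12.5% system-wide under these assumptions.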

regards, tom lane

In response to

Re: posix_fadvise v22 at 2009-01-02 19:25:47 from Greg Smith

Responses

Re: posix_fadvise v22 at 2009-01-02 20:40:49 from Gregory Stark
