Re: Warm-cache prefetching

From: "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Warm-cache prefetching
Date: 2005-12-09 06:02:22
Message-ID: dnb6jk$ikl$1@news.hub.org
Lists: pgsql-hackers


"Luke Lonergan" <llonergan(at)greenplum(dot)com> wrote:
>
>> /* prefetch ahead */
>> __asm__ __volatile__ (
>> "1: prefetchnta 128(%0)\n"
>> : : "r" (s) : "memory" );
>
> I think this kind / grain of prefetch is handled as a compiler optimization
> in the latest GNU compilers, and further there are some memory streaming
> operations for the Pentium 4 ISA that are now part of the standard compiler
> optimizations done by gcc.
>

Is there any special gcc optimization flag needed to enable this? I just
tried both 2.96 and 4.0.1 with -O2. Unfortunately, sse_clear_page() dumps
core when built with 4.0.1, at this line:

__asm__ __volatile__ (" movntdq %%xmm0, %0"::"m"(sse_save[0]) );
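
One possible cause, which I have not verified: movntdq requires its memory
operand to be 16-byte aligned and faults otherwise, and nothing guarantees
that gcc 4.0.1 places sse_save on such a boundary. A minimal sketch of
forcing the alignment is below. The names sse_save_aligned, is_aligned16 and
store_zero16 are made up for illustration and are not from the benchmark. It
uses the SSE2 intrinsic _mm_stream_si128 (which compiles to movntdq) where
SSE2 is available and falls back to memset elsewhere.

```c
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical stand-in for the benchmark's sse_save buffer,
 * forced onto a 16-byte boundary (GCC/Clang attribute syntax). */
static unsigned char sse_save_aligned[64] __attribute__((aligned(16)));

/* Returns 1 if p lies on a 16-byte boundary, 0 otherwise. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t) p & 15) == 0;
}

/*
 * Write 16 zero bytes to dst. Uses a non-temporal store only when SSE2 is
 * available and dst is 16-byte aligned; otherwise falls back to memset.
 * Returns 1 if the non-temporal path was taken, 0 otherwise.
 */
static int store_zero16(unsigned char *dst)
{
#if defined(__SSE2__)
    if (is_aligned16(dst))
    {
        _mm_stream_si128((__m128i *) dst, _mm_setzero_si128());
        _mm_sfence();           /* make the streaming store globally visible */
        return 1;
    }
#endif
    memset(dst, 0, 16);
    return 0;
}
```

Calling store_zero16(sse_save_aligned + 1) takes the memset path instead of
faulting, which is the behaviour I would want from the test program.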

So I removed this test (sorry ...). AFAICS there is no significant
difference between the two compilers. The results are attached. I will look
into the other parts of your thread tomorrow.
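
On the flag question: as far as I know, newer gcc releases (not 2.96) expose
explicit prefetching through __builtin_prefetch(addr, rw, locality), and gcc
documents a -fprefetch-loop-arrays option that asks the compiler to insert
prefetches into array loops by itself. I have not measured either here. A
minimal sketch of the builtin, roughly in the spirit of the quoted
"prefetchnta 128(%0)", is below. sum_with_prefetch and PREFETCH_AHEAD are
hypothetical names, and the distance of 32 elements equals 128 bytes only if
int is 4 bytes.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in array elements. */
#define PREFETCH_AHEAD 32

/*
 * Sum n ints while hinting the cache about data PREFETCH_AHEAD elements
 * ahead of the current position.
 */
long sum_with_prefetch(const int *s, size_t n)
{
    long   total = 0;
    size_t i;

    for (i = 0; i < n; i++)
    {
        /*
         * rw = 0 means "prefetch for read"; locality = 0 means the data has
         * no temporal locality (on x86 this typically becomes prefetchnta).
         * The bounds check keeps the address expression inside the array.
         */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&s[i + PREFETCH_AHEAD], 0, 0);
        total += s[i];
    }
    return total;
}
```

Issuing one prefetch per element is more than needed (one per cache line
would do), but it keeps the sketch short.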

Regards,
Qingqing

---

*#ll prefp3-*
-rwx------ 1 zhouqq jmgrp 38k Dec 9 00:49 prefp3-296
-rwx------ 1 zhouqq jmgrp 16k Dec 9 00:49 prefp3-401
*#./prefp3-296
2392.975 MHz
clear_page function 'gcc clear_page()' took 27142 cycles per page (172.7 MB/s)
clear_page function 'normal clear_page()' took 27161 cycles per page (172.6 MB/s)
clear_page function 'mmx clear_page() ' took 17293 cycles per page (271.1 MB/s)
clear_page function 'gcc clear_page()' took 27174 cycles per page (172.5 MB/s)
clear_page function 'normal clear_page()' took 27142 cycles per page (172.7 MB/s)
clear_page function 'mmx clear_page() ' took 17291 cycles per page (271.1 MB/s)

copy_page function 'normal copy_page()' took 18552 cycles per page (252.7 MB/s)
copy_page function 'mmx copy_page() ' took 12511 cycles per page (374.6 MB/s)
copy_page function 'sse copy_page() ' took 12318 cycles per page (380.5 MB/s)
*#./prefp3-401
2392.970 MHz
clear_page function 'gcc clear_page()' took 27120 cycles per page (172.8 MB/s)
clear_page function 'normal clear_page()' took 27151 cycles per page (172.6 MB/s)
clear_page function 'mmx clear_page() ' took 17295 cycles per page (271.0 MB/s)
clear_page function 'gcc clear_page()' took 27152 cycles per page (172.6 MB/s)
clear_page function 'normal clear_page()' took 27114 cycles per page (172.9 MB/s)
clear_page function 'mmx clear_page() ' took 17296 cycles per page (271.0 MB/s)

copy_page function 'normal copy_page()' took 18586 cycles per page (252.2 MB/s)
copy_page function 'mmx copy_page() ' took 12620 cycles per page (371.4 MB/s)
copy_page function 'sse copy_page() ' took 12698 cycles per page (369.1 MB/s)
