| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
|---|---|
| To: | Mark Mielke <mark(at)mark(dot)mielke(dot)cc> | 
| Cc: | Jeff Davis <pgsql(at)j-davis(dot)com>, Michał Zaborowski <michal(dot)zaborowski(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: Sorting Improvements for 8.4 | 
| Date: | 2007-12-20 01:02:30 | 
| Message-ID: | 1549.1198112550@sss.pgh.pa.us | 
| Lists: | pgsql-hackers | 

Mark Mielke <mark(at)mark(dot)mielke(dot)cc> writes:
> Jeff Davis wrote:
>> Also, there is probably a lot of memory copying going on, and that
>> probably destroys a lot of the effectiveness of L2 caching. When L2
>> caching is ineffective, the CPU spends a lot of time just waiting on
>> memory. In that case, it's better to have P threads of execution all
>> waiting on memory operations in parallel.
>> 
> I didn't consider the high throughput / high latency effect. This could 
> be true if the CPU prefetch isn't effective enough.

Note that if this is the argument, then there's a ceiling on the speedup
you can expect to get: it's just the extent of mismatch between the CPU
and memory speeds.  I can believe that suitable test cases would show
2X improvement for 2 threads, but it doesn't follow that you will get
10X improvement with 10 threads, or even 4X with 4.

regards, tom lane
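
The ceiling described in the message can be illustrated with a toy model. This is a sketch under stated assumptions and is not taken from the thread: each unit of sort work is assumed to cost `compute` time on the CPU plus `stall` time waiting on memory, the stalls of different threads are assumed to overlap perfectly, and the compute portions are assumed not to overlap at all. The function name and parameters are hypothetical.

```python
def latency_hiding_speedup(threads: int, compute: float, stall: float) -> float:
    """Speedup over one thread when only memory stalls can overlap.

    Assumed model (not from the original message): one unit of work
    costs `compute` + `stall`; stalls of different threads overlap
    perfectly, while compute time is serialized.
    """
    if threads < 1 or compute <= 0 or stall < 0:
        raise ValueError("need threads >= 1, compute > 0, stall >= 0")

    # One thread finishes one unit in compute + stall.
    single_thread_time = compute + stall

    # P threads finish P units in whichever is longer: the serialized
    # compute of all P units, or one full unit (compute plus its stall).
    batch_time = max(threads * compute, compute + stall)

    # Equivalent to min(threads, 1 + stall / compute).
    return threads * single_thread_time / batch_time


if __name__ == "__main__":
    # With stall equal to compute, the mismatch is 2:1, so the ceiling is 2X.
    for p in (1, 2, 4, 10):
        print(p, latency_hiding_speedup(p, compute=1.0, stall=1.0))
```

Under these assumptions the speedup is `min(threads, 1 + stall / compute)`: when the stall time equals the compute time, 2 threads give 2X, and 4 or 10 threads still give only 2X. Real hardware will not overlap stalls this cleanly, so the model shows only the shape of the limit and is not a prediction.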