From: | Myron Scott <lister(at)sacadia(dot)com> |
---|---|
To: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Luke Lonergan <llonergan(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Support Parallel Query Execution in Executor |
Date: | 2006-04-11 14:47:14 |
Message-ID: | 1144766833.514665.48.camel@bear |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-patches |
On Mon, 2006-04-10 at 02:16, Martijn van Oosterhout wrote:
> The appears to be two seperate cases here though, one is to just farm
> out the read request to another process (basically aio), the other is
> to do actual processing there. The latter is obviously for more useful
> but requires a fair bit more infrastructure.
>
I ran some tests to see where time is spent during SeqScans. I did the
following.
tester=# vacuum analyze verbose test;
INFO: vacuuming "public.test"
INFO: "test": found 0 removable, 727960 nonremovable row
versions in 5353 pagesDETAIL: 0 dead row versions cannot be
removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.18s/0.27u sec elapsed 0.91 sec.
INFO: analyzing "public.test"
INFO: "test": scanned 3000 of 5353 pages, containing 407952
live rows and 0 dead rows; 3000 rows in sample, 727922 estimated
total rows
VACUUM
tester=# select version();
version
-------------------------------------------------------------------------------
PostgreSQL 8.2devel on sparc-sun-solaris2.11, compiled by GCC
gcc (GCC) 3.3.2
(1 row)
tester=# select count(random) from test;
count
--------
727960
(1 row)
With the follow ing dtrace results...
# ./probediff2.d 514607
dtrace: script './probediff2.d' matched 10 probes
CPU ID FUNCTION:NAME
0 46811 ExecEndSeqScan:return scan time 20406
^C
smgrread
641566800
Virtualized - smgrread
439798800
smgread - Call Count
5353
HeapTupleSatisfiesSnapshot
6735471000
Virtualized - HeapTupleSatisfiesSnapshot
3516556800
HeapTupleSatisfiesSnapshot - Call Count
727960
Virtualized - ReadBuffer
558230600
ReadBuffer
864931000
Virtualized - ExecutePlan
7331181400
Virtualized - ExecSeqScan
7331349600
ExecutePlan
20405943000
ExecSeqScan
20406161000
The virtualized times are supposed to be actual time spent on the CPU
with
the time spent in the probe factored out. It seems here that half the
time
in SeqScan is spent time validating the tuples as opposed to 1/10th
doing IO.
I'm not sure that just farming out read IO is going to be all that
helpful
in this situation. That's why I think it's a good idea to create a
slave
process that prefetchs pages and transfers valid ItemPointers to the
master.
There may not be much to be gained on simple SeqScans, however, in
complex
queries that include a SeqScan, you may gain alot by offloading this
work
onto a slave thread. A table with TOAST'ed attributes comes to mind.
The
slave thread could be working away on the rest of the table while the
master
is PG_DETOAST_DATUM'ing the attributes for transmission back to the
client
or additional processing. Am I missing something in this analysis?
I've attached my dtrace script.
Myron Scott
Attachment | Content-Type | Size |
---|---|---|
probediff2.d | text/plain | 1.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Martijn van Oosterhout | 2006-04-11 15:17:37 | Re: Suboptimal evaluation of CASE expressions |
Previous Message | Andreas Tille | 2006-04-11 14:43:33 | Suboptimal evaluation of CASE expressions |
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2006-04-11 16:42:23 | Re: Patch proposal for log_duration |
Previous Message | Dhanaraj M - Sun Microsystems | 2006-04-11 07:04:37 | Re: Patch for bug #2073 (Can't drop sequence when created |