From: | a_ogawa <a_ogawa(at)hi-ho(dot)ne(dot)jp> |
---|---|
To: | pgsql-patches(at)postgresql(dot)org |
Subject: | Cache last known per-tuple offsets to speed long tuple access |
Date: | 2004-10-30 14:05:19 |
Message-ID: | PIEMIKOOMKNIJLLLBCBBGELGCCAA.a_ogawa@hi-ho.ne.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-patches |
I made a patch for "Cache last known per-tuple offsets to speed
long tuple access" that is in TODO list.
This problem was discussed on hackers-list as "Terrible performance
on wide selects".
The point of this problem is nocachegetattr() used from ExecEvalVar().
If tuple has many columns, and it has varlen column or null data,
time spent in nocachegetattr() is O(N^2) in the number of fields.
I referred URL below for implementation.
http://archives.postgresql.org/pgsql-performance/2003-01/msg00262.php
The point of this patch is as follows:
(1)heap_deformtuple_incr() is added.
This function can extract attributes of tupple, incrementally.
(2)The cache which keeps the result of heap_deformtuple_incr(),
is added inside TupleTableSlot.
(3)In ExecEvalVar(), heap_deformtuple_incr() is used in place of
nocachegetattr(). This would reduce the time from O(N^2) to O(N).
In order to measure the effect, I executed the test below.
-------------------
Table has 15,000tuples, 200columns. All data type is text.
Table name is aaa. Column name is t001...t200.
Executed SQL is,
select t100, t110, t120, t130, t140, t150,
t160, t170, t180, t190, t200
from aaa;
The profile result of original code is as follows.
-------------------
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
70.05 1.31 1.31 163846 0.00 0.00 nocachegetattr
8.02 1.46 0.15 163840 0.00 0.00 FunctionCall3
1.87 1.50 0.04 397763 0.00 0.00 AllocSetFreeIndex
1.60 1.52 0.03 163840 0.00 0.00 ExecEvalVar
1.34 1.55 0.03 200414 0.00 0.00 AllocSetAlloc
The profile result after the patch applying is as follows.
-------------------
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls ms/call ms/call name
39.73 0.29 0.29 180224 0.00 0.00 heap_deformtuple_incr
9.59 0.36 0.07 163840 0.00 0.00 FunctionCall3
6.85 0.41 0.05 16384 0.00 0.02 ExecTargetList
5.48 0.45 0.04 23477 0.00 0.00 hash_any
4.11 0.48 0.03 200414 0.00 0.00 AllocSetAlloc
Regards,
--- Atsushi Ogawa (a_ogawa(at)hi-ho(dot)ne(dot)jp)
Attachment | Content-Type | Size |
---|---|---|
deformtuple_cache.patch | application/octet-stream | 15.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Magnus Hagander | 2004-10-30 15:12:16 | Re: Win32 open items |
Previous Message | Jan Wieck | 2004-10-30 13:40:39 | Re: [PATCHES] ARC Memory Usage analysis |