From: | Joe Conway <mail(at)joeconway(dot)com> |
---|---|
To: | Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> |
Cc: | pgsql-hackers(at)postgresql(dot)org, lockhart(at)fourpalms(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us |
Subject: | Re: [GENERAL] workaround for lack of REPLACE() function |
Date: | 2002-08-12 21:55:54 |
Message-ID: | 3D582EEA.1000903@joeconway.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers pgsql-patches |
Tatsuo Ishii wrote:
>>Now a new function similar to toast_raw_datum_size(), maybe
>>toast_raw_datum_strlen() could be used to get the original string
>>length, whether MB or not, without needing to retrieve and decompress
>>the entire datum.
>>
>>I understand we would either: have to steal another bit from the VARHDR
>>which would reduce the effective size of a valena from 1GB down to .5GB;
>>or we would need to add a byte or two to the VARHDR which is extra
>>per-datum overhead. I'm not sure we would want to do either. But I
>>wanted to toss out the idea while it was fresh on my mind.
>
>
> Interesting idea. I also was thinking about adding some extra
> infomation to text data types such as character set, collation
> etc. for 7.4 or later.
I ran some tests to confirm the theory above regarding overhead;
create table strtest(f1 text);
do 100 times
insert into strtest values('12345....'); -- 100000 characters
loop
do 1000 times
select length(f1) from strtest;
loop
Results:
SQL_ASCII database, new code:
=============================
PG_RETURN_INT32(toast_raw_datum_size(PG_GETARG_DATUM(0)) - VARHDRSZ);
==> 2 seconds
SQL_ASCII database, old code:
=============================
text
*t = PG_GETARG_TEXT_P(0);
PG_RETURN_INT32(VARSIZE(t) - VARHDRSZ);
==> 66 seconds
EUC_JP database, new & old code:
================================
text
*t = PG_GETARG_TEXT_P(0);
PG_RETURN_INT32(pg_mbstrlen_with_len(VARDATA(t),
VARSIZE(t) - VARHDRSZ));
==> 469 seconds
So it appears that, while detoasting is moderately expensive (adds 64
seconds to the test), the call to pg_mbstrlen_with_len() is very
expensive (adds 403 seconds to the test).
Joe
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Darley | 2002-08-12 22:01:25 | Vacuum problem |
Previous Message | Bolden, Thomas | 2002-08-12 20:29:26 | trigger column update |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2002-08-12 21:57:03 | VACUUM's "No one parent tuple was found", redux |
Previous Message | Rod Taylor | 2002-08-12 20:58:57 | Re: CLUSTER all tables at once? |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2002-08-12 22:07:12 | Re: Antw: Re: Patch for NetWare support |
Previous Message | Neil Conway | 2002-08-12 20:52:14 | additional ONLY docs |