From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: HEAD seems to generate larger WAL regarding GIN index |
Date: | 2014-03-17 13:20:00 |
Message-ID: | CAHGQGwGU4_SVFt2EztmwDAkTQKRH3zStF-fbYRKny0Qtha9OVw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Mar 16, 2014 at 7:15 AM, Alexander Korotkov
<aekorotkov(at)gmail(dot)com> wrote:
> On Sat, Mar 15, 2014 at 11:27 PM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>>
>> On 03/15/2014 08:40 PM, Fujii Masao wrote:
>>>
>>> Hi,
>>>
>>> I executed the following statements in HEAD and 9.3, and compared
>>> the size of WAL which were generated by data insertion in GIN index.
>>>
>>> ---------------------
>>> CREATE EXTENSION pg_trgm;
>>> CREATE TABLE hoge (col1 text);
>>> CREATE INDEX hogeidx ON hoge USING gin (col1 gin_trgm_ops) WITH
>>> (FASTUPDATE = off);
>>>
>>> CHECKPOINT;
>>> SELECT pg_switch_xlog();
>>> SELECT pg_switch_xlog();
>>>
>>> SELECT pg_current_xlog_location();
>>> INSERT INTO hoge SELECT 'POSTGRESQL' FROM generate_series(1, 1000000);
>>> SELECT pg_current_xlog_location();
>>> ---------------------
>>>
>>> The results of WAL size are
>>>
>>> 960 MB (9.3)
>>> 2113 MB (HEAD)
>>>
>>> The WAL size in HEAD was more than two times bigger than that in 9.3.
>>> Recently the source code of GIN index has been changed dramatically.
>>> Is the increase in GIN-related WAL intentional or a bug?
>>
>>
>> It was somewhat expected. Updating individual items on the new-format GIN
>> pages requires decompressing and recompressing the page, and the
>> recompressed posting lists need to be WAL-logged. Which generates much
>> larger WAL records.
>>
>> That said, I didn't expect the difference to be quite that big when you're
>> appending to the end of the table. When the new entries go to the end of the
>> posting lists, you only need to recompress and WAL-log the last posting
>> list, which is max 256 bytes long. But I guess that's still a lot more WAL
>> than in the old format.
I ran "pg_xlogdump | grep Gin" and checked the size of GIN-related WAL,
and then found its max seems more than 256B. Am I missing something?
What I observed is
[In HEAD]
At first, the size of GIN-related WAL is gradually increasing up to about 1400B.
rmgr: Gin len (rec/tot): 48/ 80, tx: 1813,
lsn: 0/020020D8, prev 0/02000070, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 1 isdata: F isleaf: T isdelete: F
rmgr: Gin len (rec/tot): 56/ 88, tx: 1813,
lsn: 0/02002440, prev 0/020023F8, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 1 isdata: F isleaf: T isdelete: T
rmgr: Gin len (rec/tot): 64/ 96, tx: 1813,
lsn: 0/020044D8, prev 0/02004490, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 1 isdata: F isleaf: T isdelete: T
...
rmgr: Gin len (rec/tot): 1376/ 1408, tx: 1813,
lsn: 0/02A7EE90, prev 0/02A7E910, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 2 isdata: F isleaf: T isdelete: T
rmgr: Gin len (rec/tot): 1392/ 1424, tx: 1813,
lsn: 0/02A7F458, prev 0/02A7F410, bkp: 0000, desc: Create posting
tree, node: 1663/12945/16441 blkno: 4
Then the size decreases to about 100B and is gradually increasing
again up to 320B.
rmgr: Gin len (rec/tot): 116/ 148, tx: 1813,
lsn: 0/02A7F9E8, prev 0/02A7F458, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 4 isdata: T isleaf: T unmodified: 1280 length:
1372 (compressed)
rmgr: Gin len (rec/tot): 40/ 72, tx: 1813,
lsn: 0/02A7FA80, prev 0/02A7F9E8, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 3 isdata: F isleaf: T isdelete: T
...
rmgr: Gin len (rec/tot): 118/ 150, tx: 1813,
lsn: 0/02A83BA0, prev 0/02A83B58, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 4 isdata: T isleaf: T unmodified: 1280 length:
1374 (compressed)
...
rmgr: Gin len (rec/tot): 288/ 320, tx: 1813,
lsn: 0/02AEDE28, prev 0/02AEDCE8, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 14 isdata: T isleaf: T unmodified: 1280
length: 1544 (compressed)
Then the size decreases to 66B and is gradually increasing again up to 320B.
This increase and decrease of WAL size seems to continue.
[In 9.3]
At first, the size of GIN-related WAL is gradually increasing up to about 2700B.
rmgr: Gin len (rec/tot): 52/ 84, tx: 1812,
lsn: 0/02000430, prev 0/020003D8, bkp: 0000, desc: Insert item, node:
1663/12896/16441 blkno: 1 offset: 11 nitem: 1 isdata: F isleaf T
isdelete F updateBlkno:4294967295
rmgr: Gin len (rec/tot): 60/ 92, tx: 1812,
lsn: 0/020004D0, prev 0/02000488, bkp: 0000, desc: Insert item, node:
1663/12896/16441 blkno: 1 offset: 1 nitem: 1 isdata: F isleaf T
isdelete T updateBlkno:4294967295
...
rmgr: Gin len (rec/tot): 2740/ 2772, tx: 1812,
lsn: 0/026D1670, prev 0/026D0B98, bkp: 0000, desc: Insert item, node:
1663/12896/16441 blkno: 5 offset: 2 nitem: 1 isdata: F isleaf T
isdelete T updateBlkno:4294967295
rmgr: Gin len (rec/tot): 2714/ 2746, tx: 1812,
lsn: 0/026D21A8, prev 0/026D2160, bkp: 0000, desc: Create posting
tree, node: 1663/12896/16441 blkno: 6
The size decreases to 66B and then is never changed.
rmgr: Gin len (rec/tot): 34/ 66, tx: 1812,
lsn: 0/026D9F00, prev 0/026D9EB8, bkp: 0000, desc: Insert item, node:
1663/12896/16441 blkno: 6 offset: 451 nitem: 1 isdata: T isleaf T
isdelete F updateBlkno:4294967295
rmgr: Gin len (rec/tot): 34/ 66, tx: 1812,
lsn: 0/026D9F48, prev 0/026D9F00, bkp: 0000, desc: Insert item, node:
1663/12896/16441 blkno: 7 offset: 451 nitem: 1 isdata: T isleaf T
isdelete F updateBlkno:4294967295
rmgr: Gin len (rec/tot): 34/ 66, tx: 1812,
lsn: 0/026D9F90, prev 0/026D9F48, bkp: 0000, desc: Insert item, node:
1663/12896/16441 blkno: 8 offset: 451 nitem: 1 isdata: T isleaf T
isdelete F updateBlkno:4294967295
...
This difference in GIN-related WAL size seems to cause HEAD to generate more
than two times bigger WAL. Unfortunately the gap of WAL size would be
continuously increasing :(
>>
>> That could be optimized, but I figured we can live with it, thanks to the
>> fastupdate feature. Fastupdate allows amortizing that cost over several
>> insertions. But of course, you explicitly disabled that...
>
>
> Let me know if you want me to write patch addressing this issue.
Yeah, I really want you to address this problem! That's definitely useful
for every users disabling FASTUPDATE option for some reasons.
Regards,
--
Fujii Masao
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2014-03-17 13:20:03 | Re: Changeset Extraction v7.9.1 |
Previous Message | Andres Freund | 2014-03-17 13:16:38 | Re: Changeset Extraction v7.9.1 |