From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Hot standby and GiST page splits (was Re: WIP: Fast GiST index build) |
Date: | 2011-08-01 09:38:13 |
Message-ID: | 4E367405.3000106@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 27.07.2011 17:43, Alexander Korotkov wrote:
> OK, thanks. I also found behaviour of GiST(without patch) with streaming
> replication that seems strange for me. On master there are only few
> rightlinks are InvalidBlockNumber while on slave there are a lot of them. I
> hack gevel for getting index structure on slave (accessing tree without
> AccessExclusiveLock).
>
> On master:
> # create table test as (select point(random(),random()) from
> generate_series(1,100000));
> # create index test_idx on test using gist(point);
> # \copy (select gist_tree('test_idx')) to 'tree1r.txt';
>
> On slave:
> # \copy (select gist_tree('test_idx')) to 'tree2r.txt';
>
> In bash:
> # cat tree1r.txt | sed 's/\\n/\n/g'> tree1.txt
> # cat tree2r.txt | sed 's/\\n/\n/g'> tree2.txt
> # diff tree1.txt tree2.txt
>
> 2,89c2,89
> < 1(l:1) blk: 324 numTuple: 129 free: 2472b(69.71%) rightlink:637 (OK)
> < 1(l:2) blk: 242 numTuple: 164 free: 932b(88.58%) rightlink:319
> (OK)
> < 2(l:2) blk: 525 numTuple: 121 free: 2824b(65.39%) rightlink:153
> (OK)
> < 3(l:2) blk: 70 numTuple: 104 free: 3572b(56.23%) rightlink:551
> (OK)
> < 4(l:2) blk: 384 numTuple: 106 free: 3484b(57.30%) rightlink:555
> (OK)
> < 5(l:2) blk: 555 numTuple: 121 free: 2824b(65.39%) rightlink:74
> (OK)
> < 6(l:2) blk: 564 numTuple: 109 free: 3352b(58.92%) rightlink:294
> (OK)
> < 7(l:2) blk: 165 numTuple: 108 free: 3396b(58.38%) rightlink:567
> (OK)
> .....
> ---
>> 1(l:1) blk: 324 numTuple: 129 free: 2472b(69.71%) rightlink:4294967295
> (InvalidBlockNumber)
>> 1(l:2) blk: 242 numTuple: 164 free: 932b(88.58%)
> rightlink:4294967295 (InvalidBlockNumber)
>> 2(l:2) blk: 525 numTuple: 121 free: 2824b(65.39%)
> rightlink:4294967295 (InvalidBlockNumber)
>> 3(l:2) blk: 70 numTuple: 104 free: 3572b(56.23%)
> rightlink:4294967295 (InvalidBlockNumber)
>> 4(l:2) blk: 384 numTuple: 106 free: 3484b(57.30%)
> rightlink:4294967295 (InvalidBlockNumber)
>> 5(l:2) blk: 555 numTuple: 121 free: 2824b(65.39%)
> rightlink:4294967295 (InvalidBlockNumber)
>> 6(l:2) blk: 564 numTuple: 109 free: 3352b(58.92%)
> rightlink:4294967295 (InvalidBlockNumber)
>> 7(l:2) blk: 165 numTuple: 108 free: 3396b(58.38%)
> rightlink:4294967295 (InvalidBlockNumber)
> .....
>
> Isn't it a bug?
Yeah, it sure looks like a bug. I was certain that I had broken this in
the recent changes to GiST handling of page splits, but in fact it has
been like that forever.
The rightlinks are not needed after crash recovery, because all the
downlinks should be there. A scan will find all pages through the
downlinks, and doesn't need to follow any rightlinks. I'm not sure why
we explicitly clear them, it's not like the rightlinks would do any harm
either, but for crash recovery that's harmless.
But a scan during hot standby can see those intermediate states, just
like concurrent scans can on the master. The locking on replay of page
split needs to be fixed, too. At the moment, it locks and writes out
each page separately, so a concurrent scan could "overtake" the WAL
replay while following rightlinks, and miss tuples on the right half.
Attached is a patch for that for 9.1/master. The 9.0 GiST replay code
was quite different, it will require a separate patch.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
fix-gist-hot-standby-1.patch | text/x-diff | 2.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2011-08-01 10:13:54 | Re: Hot standby and GiST page splits (was Re: WIP: Fast GiST index build) |
Previous Message | Simon Riggs | 2011-08-01 09:24:43 | Re: One-Shot Plans |