From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Subject: | Re: Freeze avoidance of very large table. |
Date: | 2015-10-29 16:26:06 |
Message-ID: | CAD21AoCadYViiR6Q-fHKXxT4PVUGiVxam9G-6tmFVx4MxxQ0Ug@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Oct 28, 2015 at 12:58 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Sat, Oct 24, 2015 at 2:24 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
> wrote:
>>
>> On Sat, Oct 24, 2015 at 10:59 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>> wrote:
>> >
>> > I think we can display information about relallfrozen it in
>> > pg_stat_*_tables
>> > as suggested by you. It doesn't make much sense to keep it in pg_class
>> > unless we have some usecase for the same.
>> >
>>
>> I'm thinking a bit about implementing the read-only table that is
>> restricted to update/delete and is ensured that whole table is frozen,
>> if this feature is committed.
>> The value of relallfrozen might be useful for such feature.
>>
Thank you for reviewing!
> If we need this for read-only table feature, then better lets add that
> after discussing the design of that feature. It doesn't seem to be
> advisable to have an extra field in system table which we might
> need in yet not completely-discussed feature.
I changed it so that the number of frozen pages is stored in
pg_stat_all_tables as statistics information.
Also, the tests related to counting all-visible bit and skipping
vacuum are added to visibility map test, and the test related to
counting all-frozen is added to stats collector test.
Attached updated v20 patch.
> Review Comments:
> -------------------------------
> 1.
> /*
> - * Find buffer to insert this tuple into. If the page is all visible,
> - * this will also pin
> the requisite visibility map page.
> + * Find buffer to insert this tuple into. If the page is all
> visible
> + * or all frozen, this will also pin the requisite visibility map and
> + * frozen map page.
>
> */
> buffer = RelationGetBufferForTuple(relation, heaptup->t_len,
>
> InvalidBuffer, options, bistate,
>
>
> I think it is sufficient to say in the end 'visibility map page'.
> Let's not include 'frozen map page'.
Fixed.
>
> 2.
> + * corresponding page has been completely frozen, so the visibility map is
> also
> + * used for anti-wraparound
> vacuum, even if freezing tuples is required.
>
> /all tuple/all tuples
> /freezing tuples/freezing of tuples
Fixed.
> 3.
> - * Are all tuples on heapBlk visible to all, according to the visibility
> map?
> + * Are all tuples on heapBlk
> visible or frozen to all, according to the visibility map?
>
> I think it is better to modify the above statement as:
> Are all tuples on heapBlk visible to all or are marked as frozen, according
> to the visibility map?
Fixed.
> 4.
> + * releasing *buf after it's done testing and setting bits, and must set
> flags
> + * which indicates what flag
> we want to test.
>
> Here are you talking about the flags passed to visibilitymap_set(), if
> yes, then above comment is not clear, how about:
>
> and must pass flags
> for which it needs to check the value in visibility map.
Fixed.
> 5.
> + * both how many pages we skipped according to all-frozen bit of visibility
> + * map and how many
> pages we freeze page, so we can update relfrozenxid if
>
> In above sentence word 'page' after freeze sounds redundant.
> /we freeze page/we freeze
>
> Another suggestion:
> /sum of them/sum of two
Fixed.
> 6.
> + * This block is at least all-visible according to visibility map.
> +
> * We check whehter this block is all-frozen or not, to skip to
>
> whether is mis-spelled
Fixed.
> 7.
> + * If we froze any tuples or any tuples are already frozen,
> + * mark the buffer
> dirty, and write a WAL record recording the changes.
>
> Here, I think WAL record is written only when we mark some
> tuple/'s as frozen not if we they are already frozen,
> so in that regard, I think above comment is wrong.
It's wrong.
Fixed.
> 8.
> + /*
> + * We cant't allow upgrading with link mode between 9.5 or before and 9.6
> or later,
> + *
> because the format of visibility map has been changed on version 9.6.
> + */
>
>
> a. /cant't/can't
> b. changed on version 9.6/changed in version 9.6
> b. Won't such a change needs to be updated in pg_upgrade
> documentation (Notes Section)?
Fixed.
And updated document.
> 9.
> @@ -180,6 +181,13 @@ transfer_single_new_db(pageCnvCtx *pageConverter,
>
> new_cluster.controldata.cat_ver >= VISIBILITY_MAP_CRASHSAFE_CAT_VER)
> vm_crashsafe_match = false;
>
> +
> /*
> + * Do we need to rewrite visibilitymap?
> + */
> + if (old_cluster.controldata.cat_ver <
> VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
> + new_cluster.controldata.cat_ver >=
> VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
> + vm_rewrite_needed = true;
>
> ..
>
> @@ -276,7 +285,15 @@ transfer_relfile(pageCnvCtx *pageConverter, FileNameMap
> *map,
> {
>
> pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n", old_file, new_file);
>
> - if ((msg =
> copyAndUpdateFile(pageConverter, old_file, new_file, true)) != NULL)
> + /*
> +
> * Do we need to rewrite visibilitymap?
> + */
> + if (strcmp
> (type_suffix, "_vm") == 0 &&
> + old_cluster.controldata.cat_ver <
> VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
> + new_cluster.controldata.cat_ver >=
> VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
> + rewrite_vm = true;
>
> Instead of doing re-check in transfer_relfile(), I think it is better
> to pass an additional parameter in this function.
I agree.
Fixed.
>
> 10.
> You have mentioned up-thread that, you have changed the patch so that
> PageClearAllVisible clear both bits, can you please point me to this
> change?
> Basically after applying the patch, I see below code in bufpage.h:
> #define PageClearAllVisible(page) \
> (((PageHeader) (page))->pd_flags &= ~PD_ALL_VISIBLE)
>
> Don't we need to clear the PD_ALL_FROZEN separately?
Previous patch is wrong. PageClearAllVisible() should be;
#define PageClearAllVisible(page) \
(((PageHeader) (page))->pd_flags &= ~(PD_ALL_VISIBLE | PD_ALL_FROZEN))
The all-frozen flag/bit is cleared only by modifying page, so it is
impossible that only all-frozen flags/bit is cleared.
The clearing of all-visible flag/bit also means that the page has some
garbage, and is needed to vacuum.
Regards,
--
Masahiko Sawada
Attachment | Content-Type | Size |
---|---|---|
000_add_frozen_bit_into_visibilitymap_v20.patch | text/x-patch | 83.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Colin 't Hart | 2015-10-29 16:31:04 | Did the "Full-text search in PostgreSQL in milliseconds" patches land? |
Previous Message | Jeff Janes | 2015-10-29 16:22:15 | Re: pg_dump |