Re: Vacuum: allow usage of more than 1GB of work mem

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Vacuum: allow usage of more than 1GB of work mem
Date: 2017-02-01 20:47:16
Message-ID: CAD21AoDpM5qPOhg0kwCGZtpdGW92Ljq3KOXWhU2a36-azzrzJA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 31, 2017 at 3:05 AM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> On Mon, Jan 30, 2017 at 5:51 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> ----
>> * We are willing to use at most maintenance_work_mem (or perhaps
>> * autovacuum_work_mem) memory space to keep track of dead tuples. We
>> * initially allocate an array of TIDs of that size, with an upper limit that
>> * depends on table size (this limit ensures we don't allocate a huge area
>> * uselessly for vacuuming small tables). If the array threatens to overflow,
>>
>> I think that we need to update the above paragraph comment at top of
>> vacuumlazy.c file.
>
> Indeed, I missed that one. Fixing.
>
>>
>> ----
>> + numtuples = Max(numtuples,
>> MaxHeapTuplesPerPage);
>> + numtuples = Min(numtuples, INT_MAX / 2);
>> + numtuples = Min(numtuples, 2 *
>> pseg->max_dead_tuples);
>> + numtuples = Min(numtuples,
>> MaxAllocSize / sizeof(ItemPointerData));
>> + seg->dt_tids = (ItemPointer)
>> palloc(sizeof(ItemPointerData) * numtuples);
>>
>> Why numtuples is limited to "INT_MAX / 2" but not INT_MAX?
>
> I forgot to mention this one in the OP.
>
> Googling around, I found out some implemetations of bsearch break with
> array sizes beyond INT_MAX/2 [1] (they'd overflow when computing the
> midpoint).
>
> Before this patch, this bsearch call had no way of reaching that size.
> An initial version of the patch (the one that allocated a big array
> with huge allocation) could reach that point, though, so I reduced the
> limit to play it safe. This latest version is back to the starting
> point, since it cannot allocate segments bigger than 1GB, but I opted
> to keep playing it safe and leave the reduced limit just in case.
>

Thanks, I understood.

>> ----
>> @@ -1376,35 +1411,43 @@ lazy_vacuum_heap(Relation onerel, LVRelStats
>> *vacrelstats)
>> pg_rusage_init(&ru0);
>> npages = 0;
>>
>> - tupindex = 0;
>> - while (tupindex < vacrelstats->num_dead_tuples)
>> + segindex = 0;
>> + tottuples = 0;
>> + for (segindex = tupindex = 0; segindex <=
>> vacrelstats->dead_tuples.last_seg; tupindex = 0, segindex++)
>> {
>> - BlockNumber tblk;
>> - Buffer buf;
>> - Page page;
>> - Size freespace;
>>
>> This is a minute thing but tupindex can be define inside of for loop.
>
> Right, changing.
>
>>
>> ----
>> @@ -1129,10 +1159,13 @@ lazy_scan_heap(Relation onerel, int options,
>> LVRelStats *vacrelstats,
>> * instead of doing a second scan.
>> */
>> if (nindexes == 0 &&
>> - vacrelstats->num_dead_tuples > 0)
>> + vacrelstats->dead_tuples.num_entries > 0)
>> {
>> /* Remove tuples from heap */
>> - lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
>> + Assert(vacrelstats->dead_tuples.last_seg == 0); /*
>> Should not need more
>> + *
>> than one segment per
>> + * page */
>>
>> I'm not sure we need to add Assert() here but it seems to me that the
>> comment and code is not properly correspond and the comment for
>> Assert() should be wrote above of Assert() line.
>
> Well, that assert is the one that found the second bug in
> lazy_clear_dead_tuples, so clearly it's not without merit.
>
> I'll rearrange the comments as you ask though.
>
>
> Updated and rebased v7 attached.
>
>
> [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=776671

Thank you for updating the patch.

Whole patch looks good to me except for the following one comment.
This is the final comment from me.

/*
* lazy_tid_reaped() -- is a particular tid deletable?
*
* This has the right signature to be an IndexBulkDeleteCallback.
*
* Assumes dead_tuples array is in sorted order.
*/
static bool
lazy_tid_reaped(ItemPointer itemptr, void *state)
{
LVRelStats *vacrelstats = (LVRelStats *) state;

You might want to update the comment of lazy_tid_reaped() as well.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2017-02-01 20:59:10 Re: PoC plpgsql - possibility to force custom or generic plan
Previous Message Tom Lane 2017-02-01 20:36:19 Re: Patch: Write Amplification Reduction Method (WARM)