Re: Vacuum: allow usage of more than 1GB of work mem

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Vacuum: allow usage of more than 1GB of work mem
Date: 2016-09-14 16:19:43
Message-ID: CABOikdO_f7thU9MoicB_NGsGJBxRrjjQ2knjyMhCu-pnk=n7mQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 14, 2016 at 8:47 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

>
>
> I am kind of doubtful about this whole line of investigation because
> we're basically trying pretty hard to fix something that I'm not sure
> is broken. I do agree that, all other things being equal, the TID
> lookups will probably be faster with a bitmap than with a binary
> search, but maybe not if the table is large and the number of dead
> TIDs is small, because cache efficiency is pretty important. But even
> if it's always faster, does TID lookup speed even really matter to
> overall VACUUM performance? Claudio's early results suggest that it
> might, but maybe that's just a question of some optimization that
> hasn't been done yet.

Yeah, I wouldn't worry only about the lookup speedup, but if it does speed
things up, that's a bonus. The bitmaps seem to win even on memory
consumption: as both theory and experiments show, at a 10% dead tuple ratio
the bitmap wins handsomely.
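
For what it's worth, a rough back-of-the-envelope sketch of that crossover
(assuming 8KB blocks, MaxHeapTuplesPerPage = 291 and 6 bytes per TID; this
is just a cost model, not patch code):

#include <stdio.h>

/*
 * Rough cost model only, not PostgreSQL code.  Assumes 8KB blocks,
 * MaxHeapTuplesPerPage = 291 and 6 bytes per dead TID.
 */
#define TUPLES_PER_PAGE     291                         /* MaxHeapTuplesPerPage */
#define BITMAP_BYTES_PAGE   ((TUPLES_PER_PAGE + 7) / 8) /* ~37 bytes/page */
#define TID_BYTES           6                           /* per ItemPointerData */

int
main(void)
{
    /*
     * The bitmap wins on a page once the TID array for that page would be
     * bigger than the fixed-size per-page bitmap.
     */
    int         crossover = BITMAP_BYTES_PAGE / TID_BYTES;  /* ~6 tuples */

    printf("bitmap: %d bytes/page, array: %d bytes per dead tuple\n",
           BITMAP_BYTES_PAGE, TID_BYTES);
    printf("bitmap is smaller once a page has more than %d dead tuples\n",
           crossover);
    return 0;
}

With typical tables holding somewhere around 60-100 tuples per page, ~6
dead tuples per page is in the same ballpark as the 10% ratio above.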

> In theory we could even start with the list of TIDs and switch to the
> bitmap if the TID list becomes larger than the bitmap would have been,
> but I don't know if it's worth the effort.
>
>
Yes, that works too. Or maybe it's even better, because we already know the
bitmap size requirements, at least for the tuples collected so far. We
might need to maintain some more stats to further optimise the
representation, but that seems like unnecessary detail at this point.
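
To make that concrete, the kind of check I have in mind looks roughly like
this (the struct and names are hypothetical, just a sketch):

#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical bookkeeping, not actual patch code.  Decide whether the TID
 * array collected so far has grown past what a bitmap covering the same
 * block range would cost.
 */
#define TID_BYTES           6       /* per ItemPointerData */
#define BITMAP_BYTES_PAGE   37      /* ceil(MaxHeapTuplesPerPage / 8), 8KB */

typedef struct DeadTupleState
{
    size_t      num_dead_tids;      /* TIDs collected so far */
    size_t      blocks_spanned;     /* heap blocks covered so far */
} DeadTupleState;

static bool
should_switch_to_bitmap(const DeadTupleState *st)
{
    size_t      array_bytes = st->num_dead_tids * TID_BYTES;
    size_t      bitmap_bytes = st->blocks_spanned * BITMAP_BYTES_PAGE;

    /*
     * We already know the bitmap size for the range seen so far, so switch
     * as soon as the array representation becomes the larger of the two.
     */
    return array_bytes > bitmap_bytes;
}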

>
> On the other hand, if we switch to the bitmap as the ONLY possible
> representation, we will lose badly when there are scattered updates -
> e.g. 1 deleted tuple every 10 pages.

Sure, I never suggested that. What I suggested was to switch back to the
array representation once we realise the bitmap is not going to work. But I
see it's probably better the other way round.

> So it seems like we probably
> want to have both options. One tricky part is figuring out how we
> switch between them when memory gets tight; we have to avoid bursting
> above our memory limit while making the switch.

Yes, I was thinking about this problem. Some modelling will be necessary to
ensure that we don't go (much) beyond maintenance_work_mem while switching
representations, which probably means we need to make the switch earlier
than strictly necessary.
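
In other words, something along these lines, where the projected bitmap
size is known from the blocks seen so far (names hypothetical, just a
model):

#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model, not actual patch code.  While converting from the
 * TID array to the bitmap, both representations exist at once, so peak
 * usage is array + bitmap.  We must therefore trigger the switch while the
 * sum still fits, i.e. earlier than the point at which the array alone
 * would hit maintenance_work_mem.
 */
static bool
must_switch_now(size_t array_bytes,
                size_t projected_bitmap_bytes,
                size_t maintenance_work_mem_bytes)
{
    return array_bytes + projected_bitmap_bytes >= maintenance_work_mem_bytes;
}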

> For instance, one idea to grow memory usage incrementally would be to
> store dead tuple information separately for each 1GB segment of the
> relation. So we have an array of dead-tuple-representation objects,
> one for every 1GB of the relation. If there are no dead tuples in a
> given 1GB segment, then this pointer can just be NULL. Otherwise, it
> can point to either the bitmap representation (which will take ~4.5MB)
> or it can point to an array of TIDs (which will take 6 bytes/TID).
> That could handle an awfully wide variety of usage patterns
> efficiently; it's basically never worse than what we're doing today,
> and when the dead tuple density is high for any portion of the
> relation it's a lot better.
>
>
Yes, that seems like a good idea. Another idea I had in mind is some sort
of indirection map, where the bitmap for each block (or set of blocks) is
either recorded or not, depending on whether a bit is set for that range.
If the bitmap exists, the indirection map gives the offset into the larger
bitmap area. That seems similar to what you described.
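
A very rough sketch of how I picture that layout (names hypothetical, not
patch code); the relation-level structure is just an array with one entry,
possibly NULL, per 1GB segment of the heap:

#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout sketch, not actual patch code.  One entry per 1GB
 * segment of the heap (131072 blocks at 8KB): either a fixed-size bitmap
 * of roughly 4.5MB or a plain TID array at 6 bytes per dead tuple,
 * whichever is cheaper for that segment.
 */
#define BLOCKS_PER_SEGMENT  131072          /* 1GB / 8KB */

typedef enum DeadTupleRep
{
    DT_BITMAP,                  /* dense segment: fixed-size bitmap */
    DT_TID_ARRAY                /* sparse segment: sorted TID array */
} DeadTupleRep;

typedef struct DeadTupleSegment
{
    DeadTupleRep    rep;
    union
    {
        uint8_t    *bitmap;     /* BLOCKS_PER_SEGMENT * ~37 bytes */
        struct
        {
            void   *tids;       /* ItemPointerData-sized entries */
            size_t  ntids;
        }           array;
    }               u;
} DeadTupleSegment;

/*
 * One (possibly NULL) segment pointer per 1GB of the relation, so a
 * segment with no dead tuples costs nothing beyond its NULL pointer.
 */
typedef struct DeadTupleMap
{
    size_t              nsegments;
    DeadTupleSegment  **segments;
} DeadTupleMap;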

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
