Re: Vacuum: allow usage of more than 1GB of work mem

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Vacuum: allow usage of more than 1GB of work mem
Date: 2016-09-13 19:31:51
Message-ID: CAGTBQpa3Usv+J56h_L37eLY1NvQ39Nv-=KFwUKeT+hO4Fw+v7w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 13, 2016 at 4:06 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Sep 13, 2016 at 2:59 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>> I've finished writing that patch, I'm in the process of testing its CPU impact.
>>
>> First test seemed to hint at a 40% increase in CPU usage, which seems
>> rather steep compared to what I expected, so I'm trying to rule out
>> some methodology error here.
>
> Hmm, wow. That's pretty steep. Maybe lazy_heap_reaptid() is hotter
> than I think it is, but even if it accounts for 10% of total CPU usage
> within a vacuum, which seems like an awful lot, you'd have to make it
> 4x as expensive, which also seems like an awful lot.

IIRC, perf top reported a combined 45% between lazy_heap_reaptid and
vac_cmp_itemptr (after patching).

vac_cmp_itemptr was around 15% on its own.

That was a debug build, of course (I need the assertions and the debug
symbols); I'll retest with optimizations once the debug-build numbers make
sense.
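
For reference, vac_cmp_itemptr is the comparator handed to bsearch() over
the dead-tuple TID array, so every probe pays O(log n) indirect calls into
it; paraphrased from vacuumlazy.c (so take the details with a grain of
salt), it is roughly:

/* Paraphrased from vacuumlazy.c; assumes the usual PostgreSQL headers
 * (postgres.h, storage/itemptr.h).  Compares two ItemPointers, first by
 * block number, then by offset number. */
static int
vac_cmp_itemptr(const void *left, const void *right)
{
    BlockNumber  lblk, rblk;
    OffsetNumber loff, roff;

    lblk = ItemPointerGetBlockNumber((ItemPointer) left);
    rblk = ItemPointerGetBlockNumber((ItemPointer) right);

    if (lblk < rblk)
        return -1;
    if (lblk > rblk)
        return 1;

    loff = ItemPointerGetOffsetNumber((ItemPointer) left);
    roff = ItemPointerGetOffsetNumber((ItemPointer) right);

    if (loff < roff)
        return -1;
    if (loff > roff)
        return 1;

    return 0;
}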

>>> For example, we could keep a bitmap with one bit per K
>>> pages. If the bit is set, there is at least 1 dead tuple on that
>>> page; if clear, there are none. When we see an index tuple, we
>>> consult the bitmap to determine whether we need to search the TID
>>> list. We select K to be the smallest power of 2 such that the bitmap
>>> uses less memory than some threshold, perhaps 64kB.
>>
>> I've been pondering something like that, but that's an optimization
>> that's quite orthogonal to the multiarray stuff.
>
> Sure, but if this really does increase CPU time, it'd be reasonable to
> do something to decrease it again in order to get the other benefits
> of this patch - i.e. increasing the maintenance_work_mem limit while
> reducing the chances that overallocation will cause OOM.

I was hoping it wouldn't regress performance this much. I'd rather
micro-optimize the multiarray implementation until it no longer does, and
then think about orthogonal optimizations.
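
For concreteness, the coarse bitmap Robert describes could be as simple as
the sketch below (hypothetical names, not code from the patch, and assuming
the usual PostgreSQL typedefs): one bit per K heap pages, sized once at the
start of the heap scan, and probed before falling back to the binary search
over the TID array.

/* Sketch only -- hypothetical names, not part of the patch.
 * One bit per K = 2^page_shift heap pages; K is the smallest power of
 * two that keeps the bitmap within a fixed budget (64kB here). */
typedef struct DeadPageBitmap
{
    int     page_shift;       /* log2(pages per bit) */
    uint8   bits[64 * 1024];  /* 64kB budget = 512K bits */
} DeadPageBitmap;

/* Mark the K-page range containing blk as possibly holding dead tuples. */
static inline void
dpb_set(DeadPageBitmap *dpb, BlockNumber blk)
{
    uint32  bit = blk >> dpb->page_shift;

    dpb->bits[bit >> 3] |= (uint8) (1 << (bit & 7));
}

/* False means the whole K-page range around blk has no dead tuples,
 * so the caller can skip the bsearch over the TID array entirely. */
static inline bool
dpb_maybe_dead(const DeadPageBitmap *dpb, BlockNumber blk)
{
    uint32  bit = blk >> dpb->page_shift;

    return (dpb->bits[bit >> 3] & (1 << (bit & 7))) != 0;
}

With the 64kB budget, a 1TB table (2^27 pages of 8kB) works out to K = 256,
i.e. one bit per 2MB of heap, which is where the "2MB ranges" below come
from.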

>>> Assuming that
>>> updates and deletes to the table have some locality, we should be able
>>> to skip a large percentage of the TID searches with a probe into this
>>> very compact bitmap.
>>
>> I don't think you can assume locality
>
> Really? If you have a 1TB table, how many 2MB ranges of that table do
> you think will contain dead tuples for a typical vacuum? I think most
> tables of that size are going to be mostly static, and the all-visible
> and all-frozen bits are going to be mostly set. You *could* have
> something like a pgbench-type workload that does scattered updates
> across the entire table, but that's going to perform pretty poorly
> because you'll constantly be updating blocks that have to be pulled in
> from disk.

I have a few dozen of those in my biggest database. They take updates and
deletes all over the place, and even when the dead tuples are few, they're
scattered almost uniformly.

The thing is, I think we really need to avoid making that case worse; it
seems rather common (almost any OLTP system with a big enough user base, a
K-V style table, or TOAST tables).
