Re: Interesting misbehavior of repalloc()

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Interesting misbehavior of repalloc()
Date: 2007-08-12 17:54:04
Message-ID: 28319.1186941244@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
>> We could also do the realloc-in-place only if there isn't a 4k chunk in
>> the 4k freelist. I'm imagining that usually there wouldn't be.

> Or in general, if there's a free chunk of the right size then copy to
> it, else consider realloc-in-place. Counterintuitive but it might work.
> I'm not sure how often there wouldn't be a free chunk though ...

I experimented with this a bit. Not doing enlarge-in-place when there's
a suitable free chunk turns out to be practically a one-line addition to
AllocSetRealloc, but the question is whether the forty-line block of
enlarge-in-place code is pulling its weight at all. I added some debug
code to log when the different cases happen, and ran the regression
tests. (They may not be very representative of real-world usage, but
they're the best easy test I can think of.) What I got was

380 successful enlarge in place
438 blocked by new rule about available chunk
6078 other reallocs of small chunks

The "other reallocs" are ones where one of the existing limitations
prevents us from using realloc-in-place.
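
The three logged cases can be sketched as a small classifier. This is a
toy model and not the actual AllocSetRealloc code: the enum, the function
name, and the two boolean inputs are all hypothetical, and the order of
the two checks is an assumption chosen to match the way the counts above
are described.

```c
#include <assert.h>

/* Hypothetical labels for the three cases counted in the log. */
typedef enum {
    CASE_ENLARGED_IN_PLACE,     /* "successful enlarge in place" */
    CASE_BLOCKED_BY_FREE_CHUNK, /* "blocked by new rule about available chunk" */
    CASE_OTHER                  /* "other reallocs of small chunks" */
} ReallocCase;

/*
 * Classify one small-chunk realloc.
 *
 * can_grow_in_place:    nonzero if none of the pre-existing limitations
 *                       rule out enlarging the chunk where it is.
 * free_chunk_available: nonzero if the freelist already holds a chunk of
 *                       the requested size (the new rule under test).
 */
static ReallocCase
classify_realloc(int can_grow_in_place, int free_chunk_available)
{
    assert(can_grow_in_place == 0 || can_grow_in_place == 1);
    assert(free_chunk_available == 0 || free_chunk_available == 1);

    if (!can_grow_in_place)
        return CASE_OTHER;                 /* allocate a new chunk and copy */
    if (free_chunk_available)
        return CASE_BLOCKED_BY_FREE_CHUNK; /* copy into the free chunk instead */
    return CASE_ENLARGED_IN_PLACE;
}
```

In this model only the last outcome avoids a copy; the other two both end
up allocating a chunk and copying the old contents into it.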

The successful enlargements broke down like this:

12 realloc enlarge 16 -> 24
1 realloc enlarge 16 -> 32
1 realloc enlarge 16 -> 40
1 realloc enlarge 16 -> 64
1 realloc enlarge 16 -> 80
139 realloc enlarge 256 -> 512
119 realloc enlarge 512 -> 1024
80 realloc enlarge 1024 -> 2048
26 realloc enlarge 2048 -> 4096

Bearing in mind that the first number is the number of bytes of data
we'd have to copy if we don't enlarge-in-place, we're not saving that
much work. (Cases involving larger chunks are passed off to libc's
realloc(), so there's never anything bigger than 2K of copying at
stake, at least when power-of-2 request sizes are used.)
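
To put a number on "not saving that much work", the copying avoided can
be totalled as count times old chunk size over the breakdown above. The
struct and function names below are made up for illustration; the data
is simply the first profile transcribed from this message.

```c
#include <assert.h>
#include <stddef.h>

/* One line of the breakdown: how often a chunk grew from old_size to new_size. */
typedef struct {
    size_t count;
    size_t old_size; /* bytes that would be copied without enlarge-in-place */
    size_t new_size;
} EnlargeCase;

/* First profile of successful enlargements, transcribed from the log. */
static const EnlargeCase first_profile[] = {
    {12, 16, 24}, {1, 16, 32}, {1, 16, 40}, {1, 16, 64}, {1, 16, 80},
    {139, 256, 512}, {119, 512, 1024}, {80, 1024, 2048}, {26, 2048, 4096},
};
static const size_t first_profile_len =
    sizeof(first_profile) / sizeof(first_profile[0]);

/* Sum of count * old_size: total bytes of copying that enlarge-in-place avoided. */
static size_t
total_copy_bytes(const EnlargeCase *cases, size_t n)
{
    size_t total = 0;

    assert(cases != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += cases[i].count * cases[i].old_size;
    return total;
}
```

For the first profile, total_copy_bytes(first_profile, first_profile_len)
comes to 231936 bytes (about 226 KB) over the entire regression run.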

I drilled down a bit deeper and found that most of the larger reallocs
are coming from just two places: enlargement of StringInfo buffers
(initially 256 bytes) and enlargement of scan.l's literalbuf (initially
128 bytes). I changed the initial allocations to 1K for each of these,
and then the profile of successful realloc-in-place changes to

12 realloc enlarge 16 -> 24
1 realloc enlarge 16 -> 32
1 realloc enlarge 16 -> 40
1 realloc enlarge 16 -> 64
1 realloc enlarge 16 -> 80
81 realloc enlarge 1024 -> 2048
25 realloc enlarge 2048 -> 4096

Here, all of the remaining larger reallocs are happening during CREATE
VIEW operations (while constructing the pg_rewrite rule text), which
probably need not be considered a performance-critical path.

Based on this, I conclude that the realloc-in-place code doesn't pull
its weight. We should just remove it, and increase those penurious
initial allocations in stringinfo.c and scan.l to avoid most of the
use-cases for repalloc in the first place.
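
The effect of a larger initial allocation can be illustrated by counting
how many enlargements a buffer needs to reach a given size. This is a
sketch that assumes the buffer doubles on each enlargement (which is
consistent with the power-of-2 transitions logged above, but is an
assumption here); doublings_needed is a hypothetical helper, not
something in stringinfo.c or scan.l.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of doubling steps (that is, repalloc calls in this model) needed
 * to grow a buffer from `initial` bytes to at least `target` bytes.
 */
static unsigned
doublings_needed(size_t initial, size_t target)
{
    unsigned n = 0;
    size_t   size = initial;

    assert(initial > 0);
    /* The second condition guards against overflow when doubling. */
    while (size < target && size <= SIZE_MAX / 2) {
        size *= 2;
        n++;
    }
    return n;
}
```

Under that assumption, a 256-byte buffer that ends up at 4096 bytes is
enlarged four times, a 1K buffer only twice, and anything that fits in
1K is never enlarged at all.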

Does anyone have any other test cases to suggest? Stuff like pgbench
isn't interesting --- it doesn't cause repalloc to be invoked at all.

regards, tom lane
