From: | Daniel Gustafsson <daniel(at)yesql(dot)se> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions? |
Date: | 2020-07-29 21:34:07 |
Message-ID: | AE62A5AA-500F-405B-BBE6-FBA4A99CB17A@yesql.se |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> On 29 Jul 2020, at 10:32, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Wed, Jul 01, 2020 at 06:24:18PM +0900, Michael Paquier wrote:
>> I am not sure either, still it looks worth thinking about it.
>> Attached is a rebased version of the last patch. What this currently
>> holds is just the switch to heap_multi_insert() for the three catalogs
>> pg_attribute, pg_depend and pg_shdepend. One point that looks worth
>> debating about is to how much to cap the data inserted at once. This
>> uses 64kB for all three, with a number of slots chosen based on the
>> size of each record, similarly to what we do for COPY.
>
> I got an extra round of review done for this patch.
Thanks!
> While on it, I have done some measurements to see the difference in
> WAL produced and get an idea of the gain.
> On HEAD, with a table that has 1300 attributes, this leads to 563kB of
> WAL produced. With the patch, we get down to 505kB. That's an
> extreme case of course, but that's nice a nice gain.
>
> A second test, after creating a database from a template that has
> roughly 10k entries in pg_shdepend (10k empty tables actually), showed
> a reduction from 2158kB to 1762kB in WAL.
Extreme cases for sure, but more importantly, there should be no cases when
this would cause an increase wrt the status quo.
> Finally comes the third catalog, pg_depend, and there is one thing
> that makes me itching about this part. We do a lot of useless work
> for the allocation and destruction of the slots when there are pinned
> dependencies, and there can be a lot of them. Just by running the
> main regression tests, it is easy to see that in 0003 we still do a
> lot of calls of recordMultipleDependencies() for one single
> dependency, and that most of these are actually pinned. So we finish
> by doing a lot of slot manipulation to insert nothing at the end,
> contrary to the counterparts with pg_shdepend and pg_attribute.
Maybe it'd be worth pre-computing by a first pass which tracks pinned objects
in a bitmap; with a second pass which then knows how many and which to insert
into slots?
> In
> short, I think that for now it would be fine to commit a patch that
> does the multi-INSERT optimization for pg_attribute and pg_shdepend,
> but that we need more careful work for pg_depend.
Fair enough, let's break out pg_depend and I'll have another go at that.
cheers ./daniel
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2020-07-29 22:29:52 | Re: recovering from "found xmin ... from before relfrozenxid ..." |
Previous Message | Daniel Gustafsson | 2020-07-29 20:27:31 | Re: PATCH: Add uri percent-encoding for binary data |