From: | Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Avoid overhead open-close indexes (catalog updates) |
Date: | 2022-09-01 11:42:15 |
Message-ID: | CAEudQAqgBCXO13jj-ykB0ygTC3RFNSaNjr59W1OhEXr5fggoww@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Aug 31, 2022 at 22:12, Kyotaro Horiguchi <
horikyota(dot)ntt(at)gmail(dot)com> wrote:
> At Wed, 31 Aug 2022 08:16:55 -0300, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
> wrote in
> > Hi,
> >
> > The commit
> > https://github.com/postgres/postgres/commit/b17ff07aa3eb142d2cde2ea00e4a4e8f63686f96
> > introduced the CopyStatistics function.
> >
> > To do the work, CopyStatistics uses a less efficient function
> > to update/insert tuples in the system catalogs.
> >
> > The comment at indexing.c says:
> > "Avoid using it for multiple tuples, since opening the indexes
> > * and building the index info structures is moderately expensive.
> > * (Use CatalogTupleInsertWithInfo in such cases.)"
> >
> > So, inspired by the comment, I changed CatalogTupleInsert/CatalogTupleUpdate
> > in a few places to the more efficient functions
> > CatalogTupleInsertWithInfo/CatalogTupleUpdateWithInfo.
> >
> > Quick tests show a small performance gain.
>
Hi,
Thanks for taking a look at this.
>
> Considering the whole operation usually takes far longer, I'm not
> sure whether that performance gain is useful, but I like the
> change as a matter of tidiness and as an example for later code.
>
Yeah, this serves as an example for future code.
> > There are other places that this could be useful,
> > but a careful analysis is necessary.
>
> What kind of concern do you have in mind?
>
Code bloat.
Three more lines are required per call site (CatalogTupleInsert/CatalogTupleUpdate).
Also, not every code path actually reaches the insert or update.
The ideal case would be something like CopyStatistics, I think,
with no filter, or at most one, in the tuple loop.
Otherwise, the cost of calling CatalogOpenIndexes unconditionally has to be considered.
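To make the trade-off concrete, here is a toy C model. It is not PostgreSQL code: every toy_* name is made up for illustration, and it only counts how often the "indexes" get opened. It assumes the simple call opens and closes the indexes on each insert, while the WithInfo style opens them once up front, whether or not any row ends up being inserted.

```c
#include <assert.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef struct ToyIndexState
{
    int is_open;
} ToyIndexState;

static int toy_index_opens = 0;     /* how many times the indexes were opened */

/* Stand-in for the expensive step (opening indexes, building index info). */
static ToyIndexState
toy_open_indexes(void)
{
    ToyIndexState state = {1};

    toy_index_opens++;
    return state;
}

static void
toy_close_indexes(ToyIndexState *state)
{
    state->is_open = 0;
}

/* Models the simple call: opens and closes the indexes on every insert. */
static void
toy_insert(int row)
{
    ToyIndexState state = toy_open_indexes();

    (void) row;                 /* the tuple itself is irrelevant here */
    toy_close_indexes(&state);
}

/* Models the WithInfo call: the caller passes in already-open indexes. */
static void
toy_insert_with_info(int row, ToyIndexState *state)
{
    assert(state->is_open);
    (void) row;
}

/* Inserts nrows rows the simple way; returns the number of index opens. */
int
toy_copy_simple(int nrows)
{
    toy_index_opens = 0;
    for (int i = 0; i < nrows; i++)
        toy_insert(i);
    return toy_index_opens;
}

/* Same loop in the WithInfo style; returns the number of index opens. */
int
toy_copy_with_info(int nrows)
{
    ToyIndexState state;                    /* extra line 1: declaration */

    toy_index_opens = 0;
    state = toy_open_indexes();             /* extra line 2: open once, unconditionally */
    for (int i = 0; i < nrows; i++)
        toy_insert_with_info(i, &state);
    toy_close_indexes(&state);              /* extra line 3: close once */
    return toy_index_opens;
}
```

In this model, toy_copy_simple(5) opens the indexes five times and toy_copy_with_info(5) opens them once. With zero rows, the simple version opens nothing while the WithInfo version still pays for one open, which is the unconditional cost mentioned above.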
>
> By the way, there is another similar function
> CatalogTupleMultiInsertWithInfo() which would be more time-efficient
> (but not space-efficient), which is used in InsertPgAttributeTuples. I
> don't see a clear criterion for choosing between the two, though.
>
I don't think CatalogTupleMultiInsertWithInfo would be useful in the
cases reported here.
I think the cost of building the slots would be too high, and it would add
unnecessary complexity.
> I think the overhead of catalog index open is significant when any
> other time-consuming tasks are not involved in the whole operation.
> In that sense, in terms of performance, storeOperators and
> storeProcedures (called under DefineOpClass) might benefit more
> from that, disregarding how rarely the command is used.
>
Yeah, storeOperators and storeProcedures are good candidates.
Let's wait for this patch to be accepted and committed; then we can try to
change those as well.
I will create a CF entry.
regards,
Ranier Vilela