From: | Zhihong Yu <zyu(at)yugabyte(dot)com> |
---|---|
To: | Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, michail(dot)nikolaev(at)gmail(dot)com, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com> |
Subject: | Re: BufferAlloc: don't take two simultaneous locks |
Date: | 2022-03-13 14:05:10 |
Message-ID: | CALNJ-vQP2QvLMfzk2z8s+2g5mt7Yir3yOtgmiRZiHJk+Nuq+yw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Mar 13, 2022 at 3:25 AM Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
wrote:
> В Пт, 11/03/2022 в 17:21 +0900, Kyotaro Horiguchi пишет:
> > At Fri, 11 Mar 2022 15:49:49 +0900 (JST), Kyotaro Horiguchi <
> horikyota(dot)ntt(at)gmail(dot)com> wrote in
> > > At Fri, 11 Mar 2022 15:30:30 +0900 (JST), Kyotaro Horiguchi <
> horikyota(dot)ntt(at)gmail(dot)com> wrote in
> > > > Thanks! I looked into dynahash part.
> > > >
> > > > struct HASHHDR
> > > > {
> > > > - /*
> > > > - * The freelist can become a point of contention in
> high-concurrency hash
> > > >
> > > > Why did you move around the freeList?
>
> This way it is possible to allocate just first partition, not all 32
> partitions.
>
> >
> > Then I looked into bufmgr part. It looks fine to me but I have some
> > comments on code comments.
> >
> > > * To change the association of a valid buffer, we'll
> need to have
> > > * exclusive lock on both the old and new mapping
> partitions.
> > > if (oldFlags & BM_TAG_VALID)
> >
> > We don't take lock on the new mapping partition here.
>
> Thx, fixed.
>
> > + * Clear out the buffer's tag and flags. We must do this to
> ensure that
> > + * linear scans of the buffer array don't think the buffer is
> valid. We
> > + * also reset the usage_count since any recency of use of the
> old content
> > + * is no longer relevant.
> > + *
> > + * We are single pinner, we hold buffer header lock and exclusive
> > + * partition lock (if tag is valid). Given these statements it
> is safe to
> > + * clear tag since no other process can inspect it to the moment.
> >
> > This comment is a merger of the comments from InvalidateBuffer and
> > BufferAlloc. But I think what we need to explain here is why we
> > invalidate the buffer here despite of we are going to reuse it soon.
> > And I think we need to state that the old buffer is now safe to use
> > for the new tag here. I'm not sure the statement is really correct
> > but clearing-out actually looks like safer.
>
> I've tried to reformulate the comment block.
>
> >
> > > Now it is safe to use victim buffer for new tag. Invalidate the
> > > buffer before releasing header lock to ensure that linear scans of
> > > the buffer array don't think the buffer is valid. It is safe
> > > because it is guaranteed that we're the single pinner of the buffer.
> > > That pin also prevents the buffer from being stolen by others until
> > > we reuse it or return it to freelist.
> >
> > So I want to revise the following comment.
> >
> > - * Now it is safe to use victim buffer for new tag.
> > + * Now reuse victim buffer for new tag.
> > > * Make sure BM_PERMANENT is set for buffers that must be
> written at every
> > > * checkpoint. Unlogged buffers only need to be written at
> shutdown
> > > * checkpoints, except for their "init" forks, which need to be
> treated
> > > * just like permanent relations.
> > > *
> > > * The usage_count starts out at 1 so that the buffer can
> survive one
> > > * clock-sweep pass.
> >
> > But if you think the current commet is fine, I don't insist on the
> > comment chagnes.
>
> Used suggestion.
>
> Fr, 11/03/22 Yura Sokolov wrote:
> > В Пт, 11/03/2022 в 15:49 +0900, Kyotaro Horiguchi пишет:
> > > BufTableDelete considers both reuse and !reuse cases but
> > > BufTableInsert doesn't and always does HASH_ASSIGN. That looks
> > > odd. We should use HASH_ENTER here. Thus I think it is more
> > > reasonable that HASH_ENTRY uses the stashed entry if exists and
> > > needed, or returns it to freelist if exists but not needed.
> > >
> > > What do you think about this?
> >
> > Well... I don't like it but I don't mind either.
> >
> > Code in HASH_ENTER and HASH_ASSIGN cases differs much.
> > On the other hand, probably it is possible to merge it carefuly.
> > I'll try.
>
> I've merged HASH_ASSIGN into HASH_ENTER.
>
> As in previous letter, three commits are concatted to one file
> and could be applied with `git am`.
>
> -------
>
> regards
>
> Yura Sokolov
> Postgres Professional
> y(dot)sokolov(at)postgrespro(dot)ru
> funny(dot)falcon(at)gmail(dot)com
Hi,
In the description:
There is no need to hold both lock simultaneously.
both lock -> both locks
+ * We also reset the usage_count since any recency of use of the old
recency of use -> recent use
+BufTableDelete(BufferTag *tagPtr, uint32 hashcode, bool reuse)
Later on, there is code:
+ reuse ? HASH_REUSE : HASH_REMOVE,
Can flag (such as HASH_REUSE) be passed to BufTableDelete() instead of bool
? That way, flag can be used directly in the above place.
+ long nalloced; /* number of entries initially allocated for
nallocated isn't very long. I think it would be better to name the
field nallocated* '*nallocated'.
+ sum += hashp->hctl->freeList[i].nalloced;
+ sum -= hashp->hctl->freeList[i].nfree;
I think it would be better to calculate the difference between nalloced and
nfree first, then add the result to sum (to avoid overflow).
Subject: [PATCH 3/3] reduce memory allocation for non-partitioned dynahash
memory allocation -> memory allocations
Cheers
From | Date | Subject | |
---|---|---|---|
Next Message | vignesh C | 2022-03-13 17:03:43 | Tab completion not listing schema list for create/alter publication for all tables in schema |
Previous Message | Dilip Kumar | 2022-03-13 11:35:18 | Re: Support logical replication of DDLs |