From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: _hash_addovflpage has a bug |
Date: | 2017-01-07 04:32:02 |
Message-ID: | CAA4eK1+Gk6efYKBUfXnQZGx+wDWAi48uRHpOnJ8S5KLEq1xy1Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Jan 7, 2017 at 2:33 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> It looks to to me like the recent hash index changes have left
> _hash_addovflpage slightly broken. I think that if that function
> reaches the point where it calls _hash_getbuf() to fetch the next page
> in the bucket chain, we also need to clear retain_pin. Otherwise,
> we'll erroneously think that we're supposed to retain a pin even when
> the current page is an overflow page rather than the primary bucket
> page, which is wrong.
>
How? I think we are ensuring that we retain pin only if it is a
bucket page. The check is as below:
if ((pageopaque->hasho_flag & LH_BUCKET_PAGE) && retain_pin)
> A larger question, I suppose, is why this function wants to be certain
> of adding a new page even if someone else has already done so. It
> seems like it might be smarter for it to just return the newly added
> page to the caller and let the caller try to use it. _hash_doinsert
> has an Assert() that the tuple fits on the returned page, but that
> could be deleted. As it stands, if two backends try to insert a tuple
> into the same full page at the same time, both of them will add an
> overflow page and we'll end up with 2 overflow pages each containing 1
> tuple.
>
I think it is because in the current algorithm we first get an
overflow page and then add it to the end. Now we can change it such
that first, we acquire the lock on the tail page, check if we still
need an overflow page, if so, then proceed, else return the already
added overflow page.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2017-01-07 05:47:05 | Re: Block level parallel vacuum WIP |
Previous Message | Bruce Momjian | 2017-01-07 01:21:01 | Re: [PATCH] Rename pg_switch_xlog to pg_switch_wal |