Re: pgsql: dshash: Add sequential scan support.

From: Zhihong Yu <zyu(at)yugabyte(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql: dshash: Add sequential scan support.
Date: 2022-07-04 10:46:11
Message-ID: CALNJ-vTe+RDn_Y7yDrSUDeqsXwtQt18qu=vqbfbujKxLROjEgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On Sun, Jul 3, 2022 at 7:56 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:

> [Re-directing to -hackers]
>
> On Fri, Mar 11, 2022 at 2:27 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2022-03-10 20:09:56 -0500, Tom Lane wrote:
> > > Andres Freund <andres(at)anarazel(dot)de> writes:
> > > > dshash: Add sequential scan support.
> > > > Add ability to scan all entries sequentially to dshash. The
> interface is
> > > > similar but a bit different both from that of dynahash and simple
> dshash
> > > > search functions. The most significant differences is that dshash's
> interfac
> > > > always needs a call to dshash_seq_term when scan ends.
> > >
> > > Umm ... what about error recovery? Or have you just cemented the
> > > proposition that long-lived dshashes are unsafe?
> >
> > I don't think this commit made it worse. dshash_seq_term() releases an
> lwlock
> > (which will be released in case of an error) and unsets
> > hash_table->find_[exclusively_]locked. The latter weren't introduced by
> this
> > patch, and are also set by dshash_find().
> >
> > I agree that ->find_[exclusively_]locked are problematic from an error
> > recovery perspective.
>
> Right, as seen in the build farm at [1]. Also reproducible with something
> like:
>
> @@ -269,6 +269,14 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size
> request_size,
> return false;
> }
>
> + /* XXX random fault injection */
> + if (op == DSM_OP_ATTACH && random() < RAND_MAX / 8)
> + {
> + close(fd);
> + elog(ERROR, "chaos");
> + return false;
> + }
> +
>
> I must have thought that it was easy and practical to write no-throw
> straight-line code and be sure to reach dshash_release_lock(), but I
> concede that it was a bad idea: even dsa_get_address() can throw*, and
> you're often likely to need to call that while accessing dshash
> elements. For example, in lookup_rowtype_tupdesc_internal(), there is
> a sequence dshash_find(), ..., dsa_get_address(), ...,
> dshash_release_lock(), and I must have considered the range of code
> between find and release to be no-throw, but now I know that it is
> not.
>
> > It's per-backend state at least and just used for assertions. We could
> remove
> > it. Or stop checking it in places where it could be set wrongly:
> dshash_find()
> > and dshash_detach() couldn't check anymore, but the rest of the
> assertions
> > would still be valid afaics?
>
> Yeah, it's all for assertions... let's just remove it. Those
> assertions were useful to me at some stage in development but won't
> hold as well as I thought, at least without widespread PG_FINALLY(),
> which wouldn't be nice.
>
> *dsa_get_address() might need to adjust the memory map with system
> calls, which might fail. If you think of DSA as not only an allocator
> but also a poor man's user level virtual memory scheme to tide us over
> until we get threads, then this is a pretty low level kind of
> should-not-happen failure that is analogous on some level to SIGBUS or
> SIGSEGV or something like that, and we should PANIC. Then we could
> claim that dsa_get_address() is no-throw. At least, that was one
> argument I had with myself while investigating that strange Solaris
> shm_open() failure, but ... I lost the argument. It's quite an
> extreme position to take just to support these assertions, which are
> of pretty limited value.
>
> [1]
> https://www.postgresql.org/message-id/20220701232009.jcwxpl45bptaxv5n%40alap3.anarazel.de

Hi,
In the description,

`new shared memory stats system in 15`

It would be clearer to add `release` before `15`.

Cheers

In response to

Browse pgsql-committers by date

  From Date Subject
Next Message Alvaro Herrera 2022-07-04 13:14:05 pgsql: Implement List support for TransactionId
Previous Message Kyotaro Horiguchi 2022-07-04 08:31:46 Re: pgsql: dshash: Add sequential scan support.

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2022-07-04 10:59:45 Re: Error from the foreign RDBMS on a foreign table I have no privilege on
Previous Message Amit Kapila 2022-07-04 10:18:33 Re: Handle infinite recursion in logical replication setup