Re: Next Steps with Hash Indexes

From: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Next Steps with Hash Indexes
Date: 2021-07-23 12:45:49
Message-ID: CANbhV-Gjrzo1Vcu4yTqyG9E-nkh1RcHn-uJDuW8biQL_o8DLKw@mail.gmail.com
Lists: pgsql-hackers

On Thu, 22 Jul 2021 at 06:10, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> It will surely work if we take an exclusive lock on both buckets
> (old and new) in this case, but I think it is better to avoid
> exclusively locking the old bucket (bucket_being_split) unless it is
> really required. We need an exclusive lock on the primary bucket
> where we are trying to insert, to keep out any other inserter with
> the same key, but I think we don't need it for the old bucket: no
> inserter with the same key would try to insert into the old bucket a
> key that now belongs to the new bucket.

Agreed.
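
For reference, here is a minimal sketch of the locking shape we are
agreeing on, using the standard buffer primitives. The function name
and arguments are assumed for illustration only; this is not the
patch code itself.

#include "postgres.h"

#include "access/hash.h"
#include "storage/bufmgr.h"
#include "utils/rel.h"

static void
hash_unique_check_locking_sketch(Relation rel, Buffer new_buf,
                                 BlockNumber old_blkno,
                                 bool being_populated)
{
    /*
     * Exclusive lock on the primary page of the bucket we are
     * inserting into; this is what keeps out any concurrent inserter
     * of the same key for the whole check-then-insert sequence.
     */
    LockBuffer(new_buf, BUFFER_LOCK_EXCLUSIVE);

    if (being_populated)
    {
        /*
         * The split is still in progress, so a matching tuple may
         * still be in the old bucket (bucket_being_split).  A read
         * lock is enough there: a concurrent inserter of the same
         * key would target the new bucket, which we already hold
         * exclusively.
         */
        Buffer      old_buf = _hash_getbuf(rel, old_blkno, HASH_READ,
                                           LH_BUCKET_PAGE);

        /* scan the old bucket's pages for a matching hash code here */

        _hash_relbuf(rel, old_buf);
    }

    /*
     * Scan the new bucket, recheck the heap for any hash-code match,
     * and insert, all while still holding the exclusive lock.
     */
    LockBuffer(new_buf, BUFFER_LOCK_UNLOCK);
}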

> > I don't think btree does that, so I'm not sure we do need that for
> > hash. Yes, there is a race condition, but someone will win. Do we care
> > who? Do we care enough to take the concurrency hit?
> >
>
> I think if we don't care we might end up with duplicates. I might be
> missing something here, but let me explain the problem I see. Say,
> while doing a unique check, we find the same hash value in the
> bucket we are trying to insert into. We can't raise a unique-key
> violation and error out at this stage without checking the actual
> value in the heap, because two different key values can always map
> to the same hash value. Now, suppose that after checking the heap we
> find the actual value doesn't match, so we decide to insert the
> value into the hash index; in the meantime, another insert of the
> same key value has already performed these checks and inserted its
> value into the hash index, which would leave a duplicate value in
> the index. I think btree doesn't have a similar risk, so we don't
> need such a mechanism for btree.

Agreed, after thinking about it more while coding.
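
To make sure we are looking at the same shape of code, the flow I am
aiming for in the WIP patch below is roughly this skeleton. The
function name is assumed, and the bucket scan and heap comparison are
only sketched in comments here.

#include "postgres.h"

#include "storage/bufmgr.h"

static void
hash_unique_insert_flow_sketch(Buffer bucket_buf)
{
    /* 1. Exclusively lock the primary page of the target bucket. */
    LockBuffer(bucket_buf, BUFFER_LOCK_EXCLUSIVE);

    /*
     * 2. Scan the bucket for index tuples with the same hash code.
     *    For each match, fetch the heap tuple and compare the real
     *    key values; only a genuine match is a unique violation,
     *    since different keys can share a hash code.
     */

    /*
     * 3. Still holding the exclusive lock, insert the new index
     *    tuple.  A concurrent inserter of the same key blocks at
     *    step 1; once it gets the lock it repeats step 2, sees our
     *    tuple, and reports the violation instead of inserting a
     *    duplicate.  Releasing the lock between steps 2 and 3 is
     *    exactly what would allow the duplicate you describe.
     */

    LockBuffer(bucket_buf, BUFFER_LOCK_UNLOCK);
}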

All of the above is implemented in the patches below:

hash_multicol.v3.patch is attached as a complete patch, slightly
updated from the earlier version. It includes docs and tests, and
passes make check.

hash_unique.v4.patch is attached as a WIP, patch-on-patch, so we can
discuss the flow of logic and the shape of the code. The actual
comparison against the heap value is not implemented yet; it is not
trivial, but it can wait until we decide the main logic. It passes
make check and executes the attached tests.

Tests are attached in a separate file and will eventually be merged
into src/test/regress/sql/hash_index.sql.

There are no tests yet for splitting or for deferred uniqueness
checks. The latter is because parse analysis changes are needed to
undo the assumption that only btrees support uniqueness, but nothing
there looks like an issue.

Thanks for your input, have a good weekend.

--
Simon Riggs http://www.EnterpriseDB.com/

Attachment Content-Type Size
hash_unique.v4.patch application/octet-stream 24.7 KB
hash_multicol.v3.patch application/octet-stream 12.2 KB
test_hash_unique.sql application/octet-stream 1.6 KB
