Quick Links

Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Peter Geoghegan <pg(at)bowt(dot)ie>
Cc:	Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject:	Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date:	2019-03-07 07:41:37
Message-ID:	dfa4f0a6-96d5-0d6e-f57a-828ef35cb3af@iki.fi
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 07/03/2019 14:54, Peter Geoghegan wrote:
> On Wed, Mar 6, 2019 at 10:15 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>> After staring at the first patch for bit longer, a few things started to
>> bother me:
>>
>> * The new struct is called BTScanInsert, but it's used for searches,
>> too. It makes sense when you read the README, which explains the
>> difference between "search scan keys" and "insertion scan keys", but now
>> that we have a separate struct for this, perhaps we call insertion scan
>> keys with a less confusing name. I don't know what to suggest, though.
>> "Positioning key"?
>
> I think that insertion scan key is fine. It's been called that for
> almost twenty years. It's not like it's an intuitive concept that
> could be conveyed easily if only we came up with a new, pithy name.

Yeah. It's been like that forever, but I must confess I hadn't paid any
attention to it, until now. I had not understood that text in the README
explaining the difference between search and insertion scan keys, before
looking at this patch. Not sure I ever read it with any thought. Now
that I understand it, I don't like the "insertion scan key" name.

> BTW, I would like to hear what you think of the idea of minusinfkey
> (and the !minusinfkey optimization) specifically.

I don't understand it :-(. I guess that's valuable feedback on its own.
I'll spend more time reading the code around that, but meanwhile, if you
can think of a simpler way to explain it in the comments, that'd be good.

>> The new BTInsertStateData struct also
>> holds the current buffer we're considering to insert to, and a
>> 'bounds_valid' flag to indicate if the saved bounds are valid for the
>> current buffer. That way, it's more straightforward to clear the
>> 'bounds_valid' flag whenever we move right.
>
> I'm not sure that that's an improvement. Moving right should be very
> rare with my patch. gcov shows that we never move right here anymore
> with the regression tests, or within _bt_check_unique() -- not once.

I haven't given performance much thought, really. I don't expect this
method to be any slower, but the point of the refactoring is to make the
code easier to understand.

>> /*
>> * Do the insertion. First move right to find the correct page to
>> * insert to, if necessary. If we're inserting to a non-unique index,
>> * _bt_search() already did this when it checked if a move to the
>> * right was required for leaf page. Insertion scankey's scantid
>> * would have been filled out at the time. On a unique index, the
>> * current buffer is the first buffer containing duplicates, however,
>> * so we may need to move right to the correct location for this
>> * tuple.
>> */
>> if (checkingunique || itup_key->heapkeyspace)
>> _bt_findinsertpage(rel, &insertstate, stack, heapRel);
>>
>> newitemoff = _bt_binsrch_insert(rel, &insertstate);
>>
>> _bt_insertonpg(rel, insertstate.buf, InvalidBuffer, stack, itup,
>> newitemoff, false);
>>
>> Does this make sense?
>
> I guess you're saying this because you noticed that the for (;;) loop
> in _bt_findinsertloc() doesn't do that much in many cases, because of
> the fastpath.

The idea is that _bt_findinsertpage() would not need to know whether the
unique checks were performed or not. I'd like to encapsulate all the
information about the "insert position we're considering" in the
BTInsertStateData struct. Passing 'checkingunique' as a separate
argument violates that, because when it's set, the key means something
slightly different.

Hmm. Actually, with patch #2, _bt_findinsertloc() could look at whether
'scantid' is set, instead of 'checkingunique'. That would seem better.
If it looks like 'checkingunique', it's making the assumption that if
unique checks were not performed, then we are already positioned on the
correct page, according to the heap TID. But looking at 'scantid' seems
like a more direct way of getting the same information. And then we
won't need to pass the 'checkingunique' flag as an "out-of-band" argument.

So I'm specifically suggesting that we replace this, in _bt_findinsertloc:

if (!checkingunique && itup_key->heapkeyspace)
break;

With this:

if (itup_key->scantid)
break;

And remove the 'checkingunique' argument from _bt_findinsertloc.

- Heikki

In response to

Re: Making all nbtree entries unique by having heap TIDs participate in comparisons at 2019-03-07 06:54:48 from Peter Geoghegan

Responses

Re: Making all nbtree entries unique by having heap TIDs participate in comparisons at 2019-03-07 07:58:58 from Heikki Linnakangas
Re: Making all nbtree entries unique by having heap TIDs participate in comparisons at 2019-03-07 08:37:43 from Peter Geoghegan
Re: Making all nbtree entries unique by having heap TIDs participate in comparisons at 2019-03-07 09:06:24 from Peter Geoghegan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	David Steele	2019-03-07 07:42:41	Re: Re: [HACKERS] Custom compression methods
Previous Message	amul sul	2019-03-07 07:32:55	Re: [HACKERS] advanced partition matching algorithm for partition-wise join