From: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WIP: BRIN multi-range indexes |
Date: | 2021-01-26 18:52:53 |
Message-ID: | CAFBsxsFudhzy1gUMp6fyj7xDXqZf5VPGC3krqsz42_0QGwcBBQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jan 22, 2021 at 10:59 PM Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
wrote:
>
>
> On 1/23/21 12:27 AM, John Naylor wrote:
> > Still, it would be great if multi-minmax can be a drop in replacement. I
> > know there was a sticking point of a distance function not being
> > available on all types, but I wonder if that can be remedied or worked
> > around somehow.
> >
>
> Hmm. I think Alvaro also mentioned he'd like to use this as a drop-in
> replacement for minmax (essentially, using these opclasses as the
> default ones, with the option to switch back to plain minmax). I'm not
> convinced we should do that - though. Imagine you have minmax indexes in
> your existing DB, it's working perfectly fine, and then we come and just
> silently change that during dump/restore. Is there some past example
> when we did something similar and it turned it to be OK?
I was assuming pg_dump can be taught to insert explicit opclasses for
minmax indexes, so that upgrade would not cause surprises. If that's true,
only new indexes would have the different default opclass.
> As for the distance functions, I'm pretty sure there are data types
> without "natural" distance - like most strings, for example. We could
> probably invent something, but the question is how much we can rely on
> it working well enough in practice.
>
> Of course, is minmax even the right index type for such data types?
> Strings are usually "labels" and not queried using range queries,
> although sometimes people encode stuff as strings (but then it's very
> unlikely we'll define the distance definition well). So maybe for those
> types a hash / bloom would be a better fit anyway.
Right.
> But I do have an idea - maybe we can do without distances, in those
> cases. Essentially, the primary issue of minmax indexes are outliers, so
> what if we simply sort the values, keep one range in the middle and as
> many single points on each tail?
That's an interesting idea. I think it would be a nice bonus to try to do
something along these lines. On the other hand, I'm not the one
volunteering to do the work, and the patch is useful as is.
--
John Naylor
EDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Finnerty, Jim | 2021-01-26 19:06:57 | Re: Challenges preventing us moving to 64 bit transaction id (XID)? |
Previous Message | Jacob Champion | 2021-01-26 18:43:03 | Re: Allow matching whole DN from a client certificate |