From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Mark Dilger <hornschnorter(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: WIP: BRIN multi-range indexes |
Date: | 2020-08-04 15:17:43 |
Message-ID: | 20200804151743.dbobvzovtnpgfpn7@development |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Aug 04, 2020 at 05:36:51PM +0300, Alexander Korotkov wrote:
>Hi, Tomas!
>
>Sorry for the late reply.
>
>On Sun, Jul 19, 2020 at 6:19 PM Tomas Vondra
><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> I think there's a number of weak points in this approach.
>>
>> Firstly, it assumes the summaries can be represented as arrays of
>> built-in types, which I'm not really sure about. It clearly is not true
>> for the bloom opclasses, for example. But even for minmax oclasses it's
>> going to be tricky because the ranges may be on different data types so
>> presumably we'd need somewhat nested data structure.
>>
>> Moreover, multi-minmax summary contains either points or intervals,
>> which requires additional fields/flags to indicate that. That further
>> complicates the things ...
>>
>> maybe we could decompose that into separate arrays or something, but
>> honestly it seems somewhat premature - there are far more important
>> aspects to discuss, I think (e.g. how the ranges are built/merged in
>> multi-minmax, or whether bloom opclasses are useful at all).
>
>I see. But there is at least a second option to introduce a new
>datatype with just an output function. In the similar way
>gist/tsvector_ops uses gtsvector key type. I think it would be more
>transparent than using just bytea. Also, this is the way we already
>use in the core.
>
So you're proposing to have a new data types "brin_minmax_multi_summary"
and "brin_bloom_summary" (or some other names), with output functions
printing something nicer? I suppose that could work, and we could even
add pageinspect functions returning the value as raw bytea.
Good idea!
>> >BTW, I've applied the patchset to the current master, but I got a lot
>> >of duplicate oids. Could you please resolve these conflicts. I think
>> >it would be good to use high oid numbers to evade conflicts during
>> >development/review, and rely on committer to set final oids (as
>> >discussed in [1]).
>> >
>> >Links
>> >1. https://www.postgresql.org/message-id/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-%3DeaochT0dd2BN9CQ%40mail.gmail.com
>>
>> Did you use the patchset from 2020/07/03? I don't get any duplicate OIDs
>> with it, and it's already using quite high OIDs (part 4 uses >= 8000,
>> part 5 uses >= 9000).
>
>Yep, it appears that I was using the wrong version of patchset.
>Patchset from 2020/07/03 works good on the current master.
>
OK, good.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Stephen Frost | 2020-08-04 15:18:37 | Re: LSM tree for Postgres |
Previous Message | Tomas Vondra | 2020-08-04 15:11:36 | Re: LSM tree for Postgres |