Re: POC, WIP: OR-clause support for indexes

From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Cc: Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, teodor(at)sigaev(dot)ru
Subject: Re: POC, WIP: OR-clause support for indexes
Date: 2023-06-29 09:55:58
Message-ID: 2ff29fc7-ef29-b2bf-0d97-6c2f076fed1b@yandex.ru
Lists: pgsql-hackers

I apologize for breaking the original thread. In my defense, I can say
that I'm new to all this and still learning; I will try to make as few
mistakes as possible.

I am trying to fix it by forwarding this message to you, since it might
be of interest to you as well.

I'm sorry if I didn't state my goals clearly at first, but it seemed to
me that the problem I encountered was very similar to the one described
in this thread; I just suggested a slightly different way of solving it.

I have described the problem, and what I believe to be the worst case,
here [1]; if that description is lacking, please let me know.

1.
https://www.mail-archive.com/pgsql-hackers(at)lists(dot)postgresql(dot)org/msg146230.html

> On 29.06.2023 12:32, Alena Rybakina wrote:
>>
>> Hi! I'm sorry I didn't answer you right away, I was too busy with work.
>>
>> On 27.06.2023 22:50, Peter Geoghegan wrote:
>>> On Tue, Jun 27, 2023 at 6:19 AM Alena Rybakina<lena(dot)ribackina(at)yandex(dot)ru> wrote:
>>>> I learned something new from your letter, thank you very much for that!
>>> Cool. The MDAM paper is also worth a read:
>>>
>>> https://vldb.org/conf/1995/P710.PDF
>>>
>>> Some of the techniques it describes are already in Postgres. With
>>> varying degrees of maturity.
>>>
>>> The paper actually mentions OR optimization at one point, under
>>> "Duplicate Elimination". The general idea is that ScalarArrayOpExpr
>>> execution can "eliminate duplicates before the data is read". The
>>> important underlying principle is that it can be really useful to give
>>> the B-Tree code the context it requires to be clever about stuff like
>>> that. We can do this by (say) using one ScalarArrayOpExpr, rather than
>>> using two or more index scans that the B-Tree code will treat as
>>> independent things. So a lot of the value in your patch comes from the
>>> way that it can enable other optimizations (the immediate benefits are
>>> also nice).
>>>
>>> In the past, OR optimizations have been prototyped that were later
>>> withdrawn/rejected because the duplicate elimination aspect was...too
>>> scary [1]. It's very easy to see that ScalarArrayOpExpr index scans
>>> don't really have the same problem. "Giving the B-Tree code the
>>> required context" helps here too.
>>>
>> Thank you for the explanation and the material provided!
>> Unfortunately, I am still studying the article and cannot write more
>> at the moment. To be honest, I hadn't considered before that my
>> optimization could help eliminate duplicates before the data is read.
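To check my understanding of the duplicate-elimination point, here is a
toy model (plain Python, not PostgreSQL internals; the data structures
and function names are invented purely for illustration) contrasting one
scan over a sorted, deduplicated key list with independent OR'd scans:

```python
# Toy illustration (not PostgreSQL code) of why a single
# ScalarArrayOpExpr scan can "eliminate duplicates before the data
# is read": the array of keys is sorted and deduplicated up front,
# so each index region is visited once, whereas independent OR'd
# scans may fetch the same rows more than once.

def saop_scan(index, keys):
    """One scan driven by a sorted, deduplicated key list."""
    unique_keys = sorted(set(keys))        # dedup before reading data
    return [row for k in unique_keys for row in index.get(k, [])]

def or_scans(index, keys):
    """Independent scans, one per OR arm; duplicates are possible."""
    return [row for k in keys for row in index.get(k, [])]

index = {4: ["row4"], 5: ["row5"]}
print(saop_scan(index, [4, 5, 4]))  # ['row4', 'row5']
print(or_scans(index, [4, 5, 4]))   # ['row4', 'row5', 'row4']
```

Of course the real B-Tree code works on index pages, not dictionaries;
the sketch only shows where the deduplication happens.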
>>
>> I am still in the process of familiarizing myself with the thread
>> [1] (the reference from your letter), but I have already seen that
>> there are problems related, for example, to "or" expressions that
>> refer to the parent element.
>>
>> I think I would face similar problems if I extended the current code
>> so that OR expressions are rewritten into ANY not only when they
>> appear at the same level, but also across different levels, without
>> violating operator-precedence logic.
>>
>> For example, this query works now:
>>
>> postgres=# EXPLAIN (analyze, COSTS OFF)
>> SELECT oid,relname FROM pg_class
>> WHERE
>>   (oid = 13779 OR oid = 2) OR (oid = 4 OR oid = 5) OR
>>   relname = 'pg_extension'
>> ;
>>
>>                                                     QUERY PLAN
>> ------------------------------------------------------------------------------------------------------------------
>>  Seq Scan on pg_class (actual time=0.086..0.140 rows=1 loops=1)
>>    Filter: ((oid = ANY ('{4,5}'::oid[])) OR (oid = ANY
>> ('{13779,2}'::oid[])) OR (relname = 'pg_extension'::name))
>>    Rows Removed by Filter: 412
>>  Planning Time: 2.135 ms
>>  Execution Time: 0.160 ms
>> (5 rows)
>>
>> But I would like it works such as:
>>
>>                                       QUERY PLAN
>> --------------------------------------------------------------------------------------
>>  Seq Scan on pg_class (actual time=0.279..0.496 rows=1 loops=1)
>>    Filter: ((oid = ANY ('{13779,2,4,5}'::oid[])) OR (relname =
>> 'pg_extension'::name))
>>    Rows Removed by Filter: 412
>>  Planning Time: 0.266 ms
>>  Execution Time: 0.536 ms
>> (5 rows)
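A hypothetical flattening pass for this (sketched here in plain Python
rather than planner C code; the tree encoding and helper names are my
own invention) could collect equality arms from any nesting depth and
group the constants by column:

```python
# Toy sketch of grouping nested OR arms by column. An OR node is
# ("OR", [children]); a leaf is a (column, constant) equality.

def flatten_or(expr):
    """Yield the equality leaves of a nested OR tree."""
    if isinstance(expr, tuple) and expr[0] == "OR":
        for child in expr[1]:
            yield from flatten_or(child)
    else:
        yield expr

def group_by_column(expr):
    """Collect constants per column, preserving first-seen order."""
    groups = {}
    for col, const in flatten_or(expr):
        groups.setdefault(col, []).append(const)
    return groups

# (oid = 13779 OR oid = 2) OR (oid = 4 OR oid = 5)
#                          OR relname = 'pg_extension'
q = ("OR", [("OR", [("oid", 13779), ("oid", 2)]),
            ("OR", [("oid", 4), ("oid", 5)]),
            ("relname", "pg_extension")])
print(group_by_column(q))
# {'oid': [13779, 2, 4, 5], 'relname': ['pg_extension']}
```

In the planner itself this would of course have to operate on
BoolExpr/OpExpr nodes and check operator and collation compatibility;
the sketch only shows the grouping idea.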
>>
>>>> I analyzed buffer usage when I ran the standard regression tests with my patch. As far as I can see, the diff shows no difference in the number of buffer block reads with and without the patch (regression.diffs).
>>> To be clear, I wasn't expecting that there'd be any regressions from
>>> your patch. Intuitively, it seems like this optimization should make
>>> the query plan do almost the same thing at execution time -- just
>>> slightly more efficiently on average, and much more efficiently in
>>> some individual cases.
>>>
>>> It would probably be very hard for the optimizer to model/predict how
>>> much work it can save by using a ScalarArrayOpExpr instead of an
>>> "equivalent" set of bitmap index scans, OR'd together. But it doesn't
>>> necessarily matter -- the only truly critical detail is understanding
>>> the worst case for the transformation optimization.
>> Yes, I agree with you and I have yet to analyze this.
>>> It cannot be too
>>> bad (maybe it's ~zero added runtime overhead relative to not doing the
>>> transformation, even?).
>> I haven't seen any major performance degradation so far, but to be
>> honest, I have not done a detailed analysis of query shapes other
>> than x=1 or x=2 or x=1 or y=2, etc. As soon as I know something, I
>> will provide the data; it is very interesting to me.
>>> At the same time, nbtree can be clever about
>>> ScalarArrayOpExpr execution at runtime (once that's implemented),
>>> without ever needing to make any kind of up-front commitment to
>>> navigating through the index in any particular way. It's all dynamic,
>>> and can be driven by the actual observed characteristics of the index
>>> structure.
>>>
>>> In other words, we don't really need to gamble (in the planner, or at
>>> execution time). We're just keeping our options open in more cases.
>>> (My thinking on these topics was influenced by Goetz Graefe -- "choice
>>> is confusion" [2]).
>>
>> Unfortunately, when I tried to apply the transformation at the stage
>> of index path formation, I ran into a badly incorrect selectivity
>> estimate for the relation, which in turn distorted the cost and
>> cardinality calculations. I couldn't solve this problem.
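For reference, here is the arithmetic I used to sanity-check the
estimates. It is a toy Python calculation assuming the textbook
combination formula for OR clauses (my reading, not the actual planner
code); it suggests the correct estimate for this predicate is about 3
rows, while the rows=25001 I got corresponds to a selectivity of ~0.5,
which looks like a default fallback, i.e. the transformed clause may no
longer be recognized by the estimator at that stage:

```python
# Toy selectivity arithmetic, assuming the textbook formula
# s = s1 + s2 - s1*s2 for OR (my reading of how estimates combine,
# not the actual planner code).

def or_selectivity(sels):
    """Combine selectivities pairwise under the OR formula."""
    s = 0.0
    for si in sels:
        s = s + si - s * si
    return s

n_rows = 50000
per_value = 1.0 / n_rows   # unique column: each value matches one row

# unique2 = 3 OR unique2 = 7 OR unique1 = 1 -> about 3 rows expected
expected = or_selectivity([per_value] * 3) * n_rows
print(round(expected, 2))          # 3.0

# rows=25001 from my experiment corresponds to selectivity ~0.5,
# which is what the planner falls back to for unestimable clauses
print(25001 / n_rows)              # 0.50002
```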
>>
>> My diff (transform_or_v0.diff). I got this result:
>>
>> CREATE TABLE tenk1 (unique1 int, unique2 int, ten int, hundred int);
>> insert into tenk1 SELECT x,x,x,x FROM generate_series(1,50000) as x;
>> CREATE INDEX a_idx1 ON tenk1(unique1);
>> CREATE INDEX a_idx2 ON tenk1(unique2);
>> CREATE INDEX a_hundred ON tenk1(hundred);
>>
>> postgres=# explain analyze
>> select * from tenk1 a join tenk1 b on
>> a.unique2 = 3 or a.unique2 = 7 or a.unique1 = 1;
>> QUERY PLAN
>> ----------------------------------------------------------------------------------------------------------------------
>> Nested Loop (cost=0.00..15627479.50 rows=1250050000 width=32) (actual time=0.040..75.531 rows=150000 loops=1)
>> -> Seq Scan on tenk1 b (cost=0.00..771.00 rows=50000 width=16) (actual time=0.022..5.467 rows=50000 loops=1)
>> -> Materialize (cost=0.00..1146.01 rows=25001 width=16) (actual time=0.000..0.001 rows=3 loops=50000)
>> -> Seq Scan on tenk1 a (cost=0.00..1021.00 rows=25001 width=16) (actual time=0.011..22.789 rows=3 loops=1)
>> Filter: ((unique2 = ANY (ARRAY[3, 7])) OR (unique1 = 1))
>> Rows Removed by Filter: 49997
>> Planning Time: 0.427 ms
>> Execution Time: 80.027 ms
>> (8 rows)
>>
>> The current patch's result:
>>
>> postgres=# set enable_bitmapscan ='off';
>> SET
>> postgres=# explain analyze
>> select * from tenk1 a join tenk1 b on
>> a.unique2 = 3 or a.unique2 = 7 or a.unique1 = 1;
>> QUERY PLAN
>> ----------------------------------------------------------------------------------------------------------------------
>> Nested Loop (cost=0.00..22247.02 rows=1350000 width=32) (actual time=0.094..373.627 rows=1350000 loops=1)
>> -> Seq Scan on tenk1 b (cost=0.00..2311.00 rows=150000 width=16) (actual time=0.051..14.667 rows=150000 loops=1)
>> -> Materialize (cost=0.00..3061.05 rows=9 width=16) (actual time=0.000..0.001 rows=9 loops=150000)
>> -> Seq Scan on tenk1 a (cost=0.00..3061.00 rows=9 width=16) (actual time=0.026..42.389 rows=9 loops=1)
>> Filter: ((unique2 = ANY ('{3,7}'::integer[])) OR (unique1 = 1))
>> Rows Removed by Filter: 149991
>> Planning Time: 0.414 ms
>> Execution Time: 409.154 ms
>> (8 rows)
>>
>>> [1]https://www.postgresql.org/message-id/flat/1397.1486598083%40sss.pgh.pa.us#310f974a8dc84478d6d3c70f336807bb
>>> [2]https://sigmodrecord.org/publications/sigmodRecord/2009/pdfs/05_Profiles_Graefe.pdf
>> Thank you again for the explanations and the material provided. I
>> will carefully study everything as soon as possible and will write if
>> there are any thoughts or if there are ideas about my patch.

--
Regards,
Alena Rybakina
Postgres Professional
