From: | Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> |
---|---|
To: | jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | Andrei Lepikhov <lepihov(at)gmail(dot)com>, Nikolay Shaplov <dhyan(at)nataraj(dot)su>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, teodor(at)sigaev(dot)ru, Peter Eisentraut <peter(at)eisentraut(dot)org>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> |
Subject: | Re: POC, WIP: OR-clause support for indexes |
Date: | 2024-08-26 10:41:01 |
Message-ID: | fe11e5c9-877e-4306-a6a9-122125c7d2e5@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi!
On 26.08.2024 06:12, jian he wrote:
> On Fri, Aug 23, 2024 at 8:58 PM Alexander Korotkov<aekorotkov(at)gmail(dot)com> wrote:
> based on v37.
>
>
> + {
> + /*
> + * We have only Const's. In this case we can construct an array
> + * directly.
> + */
> + int16 typlen;
> + bool typbyval;
> + char typalign;
> + Datum *elems;
> + int i = 0;
> + ArrayType *arrayConst;
> +
> + get_typlenbyvalalign(consttype, &typlen, &typbyval, &typalign);
> +
> + elems = (Datum *) palloc(sizeof(Datum) * list_length(consts));
> + foreach(lc, consts)
> + elems[i++] = ((Const *) lfirst(lc))->constvalue;
> +
> + arrayConst = construct_array(elems, i, consttype,
> + typlen, typbyval, typalign);
> + arrayNode = (Node *) makeConst(arraytype, -1, inputcollid,
> + -1, PointerGetDatum(arrayConst),
> + false, false);
> +
> + pfree(elems);
> + list_free(consts);
> + }
> List "consts" elements can be NULL?
> I didn't find a query to trigger that.
> but construct_array comments says
> "elems (NULL element values are not supported)."
> Do we need to check Const->constisnull for the Const node?
I didn't find any problems here either, but the query plan seems strange
to me: a form of OR expressions is added to the recheck condition. But
we discussed this before and came to the conclusion that this is not a
mistake.
I added the query to the create_index.sql
EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous =
42 or tenthous is null);
+ QUERY PLAN
+-------------------------------------------------------------------------------------------------------------------------------------------
+ Bitmap Heap Scan on tenk1
+ Recheck Cond: (((thousand = 42) AND (tenthous IS NULL)) OR
((thousand = 42) AND ((tenthous = 1) OR (tenthous = 3) OR (tenthous = 42))))
+ Filter: ((tenthous = 1) OR (tenthous = 3) OR (tenthous = 42) OR
(tenthous IS NULL))
+ -> BitmapOr
+ -> Bitmap Index Scan on tenk1_thous_tenthous
+ Index Cond: ((thousand = 42) AND (tenthous IS NULL))
+ -> Bitmap Index Scan on tenk1_thous_tenthous
+ Index Cond: ((thousand = 42) AND (tenthous = ANY
('{1,3,42}'::integer[])))
+(8 rows)
I noticed that the NULL element is not added to the converted array
because it belongs to a different group.
So, I think this problem may not be affect us. Gere we should add
Assertion that an element is not null.
> + /* Construct the list of nested OR arguments */
> + for (j = group_start; j < i; j++)
> + {
> + Node *arg = list_nth(orargs, matches[j].argindex);
> +
> + rargs = lappend(rargs, arg);
> + if (IsA(arg, RestrictInfo))
> + args = lappend(args, ((RestrictInfo *) arg)->clause);
> + else
> + args = lappend(args, arg);
> + }
> the ELSE branch never reached?
>
Reached - if your arg is BoolExpr type, for example if it consists "And"
expressions.
> + /* Construct the nested OR and wrap it with RestrictInfo */
> + *subrinfo = *rinfo;
> + subrinfo->clause = make_orclause(args);
> + subrinfo->orclause = make_orclause(rargs);
> + result = lappend(result, subrinfo);
> should we use memcpy instead of " *subrinfo = *rinfo;"?
>
>
> + /* Sort clauses to make similar clauses go together */
> + pg_qsort(matches, n, sizeof(OrArgIndexMatch), or_arg_index_match_cmp);
> Should we use qsort?
> since comments in pg_qsort:
> /*
> * Callers should use the qsort() macro defined below instead of calling
> * pg_qsort() directly.
> */
I think yes, we should.
> +/*
> + * Data structure representing information about OR-clause argument and its
> + * matching index key. Used for grouping of similar OR-clause arguments in
> + * group_similar_or_args().
> + */
> +typedef struct
> +{
> + int indexnum; /* index of the matching index */
> + int colnum; /* index of the matching column */
> + Oid opno; /* OID of the OpClause operator */
> + Oid inputcollid; /* OID of the OpClause input collation */
> + int argindex; /* index of the clause in the list of
> + * arguments */
> +} OrArgIndexMatch;
>
> I am not 100% sure about the comments.
> indexnum: index of the matching index reside in rel->indexlist that
> matches (counting from 0)
> colnum: the column number of the matched index (counting from 0)
>
To be honest, I'm not sure that I completely understand your point here.
I have found an interesting case here:
+EXPLAIN (COSTS OFF) +SELECT * FROM tenk1 + WHERE thousand = 42 AND
(stringu1 = 'MAAAAA' OR stringu1 = 'TUAAAA'::text OR stringu1 =
'OBAAAA'::text); + QUERY PLAN
+-------------------------------------------------------------------------------------------------------------------------------------
+ Bitmap Heap Scan on tenk1 + Recheck Cond: ((thousand = 42) AND
((stringu1 = 'MAAAAA'::name) OR ((stringu1 = 'TUAAAA'::text) OR
(stringu1 = 'OBAAAA'::text)))) + Filter: ((stringu1 = 'MAAAAA'::name) OR
(stringu1 = 'TUAAAA'::text) OR (stringu1 = 'OBAAAA'::text)) + ->
BitmapAnd + -> Bitmap Index Scan on tenk1_thous_tenthous + Index Cond:
(thousand = 42) + -> BitmapOr + -> Bitmap Index Scan on stringu1_idx +
Index Cond: (stringu1 = 'MAAAAA'::name) + -> Bitmap Index Scan on
stringu1_idx + Index Cond: (stringu1 = ANY ('{TUAAAA,OBAAAA}'::text[]
COLLATE "C")) +(11 rows) +
If OR constants have different types, then they belong to different
groups, and I think that's unfair. I think that conversion to a single
type should be used here - while I’m working on this, I’ll send the code
in the next letter.
And I noticed that there were some tests missing on this, so I added this.
I've updated the patch file to include my and Jian's suggestions, as
well as the diff file if there's no objection.
--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company
Attachment | Content-Type | Size |
---|---|---|
diff.diff.no-cfbot | text/plain | 6.2 KB |
v38-0001-Transform-OR-clauses-to-SAOP-s-during-index-matching.patch | text/x-patch | 41.6 KB |
v38-0002-Teach-bitmap-path-generation-about-transforming-OR-c.patch | text/x-patch | 28.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Dean Rasheed | 2024-08-26 10:48:03 | Re: Optimize mul_var() for var1ndigits >= 8 |
Previous Message | shveta malik | 2024-08-26 10:36:29 | Re: Conflict detection and logging in logical replication |