RE: Improve EXPLAIN output for multicolumn B-Tree Index

From: <Masahiro(dot)Ikeda(at)nttdata(dot)com>
To: <boekewurm+postgres(at)gmail(dot)com>, <postgres(at)jeltef(dot)nl>
Cc: <pg(at)bowt(dot)ie>, <pgsql-hackers(at)lists(dot)postgresql(dot)org>, <Masao(dot)Fujii(at)nttdata(dot)com>
Subject: RE: Improve EXPLAIN output for multicolumn B-Tree Index
Date: 2024-07-01 02:53:51
Message-ID: TYWPR01MB1098258DC534763240D5ECE23B1D32@TYWPR01MB10982.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2024-06-28 21:05, Matthias van de Meent wrote:
> On Fri, 28 Jun 2024 at 10:59, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl> wrote:
>>
>> On Fri, 28 Jun 2024 at 00:41, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>> > Typically, no, it won't be. But there's really no telling for sure.
>> > The access patterns for a composite index on '(a, b)' with a qual
>> > "WHERE b = 5" are identical to a qual explicitly written "WHERE a =
>> > any(<every possible value in 'a'>) AND b = 5".
>>
>> Hmm, that's true. But in that case the explain plan gives a clear hint
>> that something like that might be going on, because you'll see:
>>
>> Index Cond: a = any(<every possible value in 'a'>) AND b = 5
>>
>> That does make me think of another way, and maybe more "correct" way,
>> of representing Masahiros situation than adding a new "Skip Scan Cond"
>> row to the EXPLAIN output. We could explicitly include a comparison to
>> all prefix columns in the Index Cond:
>>
>> Index Cond: ((test.id1 = 1) AND (test.id2 = ANY) AND (test.id3 = 101))
>>
>> Or if you want to go with syntactically correct SQL we could do the following:
>>
>> Index Cond: ((test.id1 = 1) AND ((test.id2 IS NULL) OR (test.id2 IS
>> NOT NULL)) AND (test.id3 = 101))
>>
>> An additional benefit this provides is that you now know which
>> additional column you should use a more specific filter on to speed up
>> the query. In this case test.id2
>>
>> OTOH, maybe it's not a great way because actually running that puts
>> the IS NULL+ IS NOT NULL query in the Filter clause (which honestly
>> surprises me because I had expected this "always true expression"
>> would have been optimized away by the planner).
>>
>> > EXPLAIN (VERBOSE, ANALYZE) SELECT id1, id2, id3 FROM test WHERE id1 = 1 AND (id2 IS NULL OR id2 IS NOT NULL) AND id3 = 101;
>> QUERY PLAN
>> ─────────────────────────────────────────────────────
>> Index Only Scan using test_idx on public.test (cost=0.42..12809.10
>> rows=1 width=12) (actual time=0.027..11.234 rows=1 loops=1)
>> Output: id1, id2, id3
>> Index Cond: ((test.id1 = 1) AND (test.id3 = 101))
>> Filter: ((test.id2 IS NULL) OR (test.id2 IS NOT NULL))
>>
>> > What about cases where we legitimately have to vary our strategy
>> > during the same index scan?
>>
>> Would my new suggestion above address this?
>>
>> > In fact, that'd make sense even today, without skip scan (just with
>> > the 17 work on nbtree SAOP scans). Even with regular SAOP nbtree index
>> > scans, the number of primitive scans is hard to predict, and quite
>> > indicative of what's really going on with the scan.
>>
>> *googles nbtree SAOP scans and finds the very helpful[1]*
>>
>> Yes, I feel like this should definitely be part of the ANALYZE output.
>> Seeing how Lukas has to look at pg_stat_user_tables to get this
>> information seems quite annoying[2] and only possible on systems that
>> have no concurrent queries.
>>
>> So it sounds like we'd want a "Primitive Index Scans" counter in
>> ANALYZE too. In addition to the number of filtered rows by, which if
>> we go with my proposal above should probably be called "Rows Removed
>> by Index Cond".
>
> This all just made me more confident that this shows a need to enable
> index AMs to provide output for EXPLAIN: The knowledge about how index
> scankeys are actually used is exclusively known to the index AM,
> because the strategies are often unique to the index AM (or even
> chosen operator classes), and sometimes can only be applied at
> runtime: while the index scankeys' sizes, attribute numbers and
> operators are known in advance (even if not all arguments are filled
> in; `FROM a JOIN b ON b.id = ANY (a.ref_array)`), the AM can at least
> show what strategy it is likely going to choose, and how (in)efficient
> that strategy could be.
>
> Right now, Masahiro-san's patch tries to do that with an additional
> field in IndexPath populated (by proxy) exclusively in btcostestimate.
> I think that design is wrong, because it wires explain- and
> btree-specific data through the planner, adding overhead everywhere
> which is only used for btree- and btree-compatible indexes.
>
> I think the better choice would be adding an IndexAmRoutine->amexplain
> support function, which would get called in e.g. explain.c's
> ExplainIndexScanDetails to populate a new "Index Scan Details" (name
> to be bikeshed) subsection of explain plans. This would certainly be
> possible, as the essentials for outputting things to EXPLAIN are
> readily available in the explain.h header.

Yes, that's one of my concerns. I agree to add IndexAmRoutine->amexplain
is better because we can support several use cases.

Although I'm not confident to add only IndexAmRoutine->amexplain is enough
now, I'll make a PoC patch to confirm it.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiro.Ikeda 2024-07-01 02:55:54 RE: Improve EXPLAIN output for multicolumn B-Tree Index
Previous Message David Rowley 2024-07-01 02:52:42 Re: Should we document how column DEFAULT expressions work?