Re: PostgreSQL 12.3 slow index scan chosen

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kenneth Marshall <ktm(at)rice(dot)edu>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: PostgreSQL 12.3 slow index scan chosen
Date: 2020-06-19 23:07:13
Message-ID: 1762066.1592608033@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

[ please keep the mailing list cc'd ]

Kenneth Marshall <ktm(at)rice(dot)edu> writes:
> Here are the stats for articles.id:

> 4,7,9,11,13,14,16,17,18,19,20,21,22,23,
> 24,25,26,32,33,34,36,40,41,42,43,44,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,99,100,101,102,106,107,108,109,113,1 14,115,116,117,118,119,120,121,122,123,125,126,127,128,129,130,131,133,134,135,136,137,140,141,142,143,144,145,146,14 7,148,149,150,151,152,153,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177 ,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206, 207,208,1209,1210,1212,1213,1214,1215,1216,1219,1220,1221,1222,1223}
> That completely matches the max(id) for articles.id.

Hm, well it's clear why the planner is going for the mergejoin strategy:
it expects to only have to scan a very small fraction of the other table
before it's up past objectid = 1223 and can stop merging. And it
seems like it's right ...

... oh, now I see: apparently, your filter condition is such that *no*
rows of the objectcustomfieldvalues table get past the filter:

-> Index Scan using objectcustomfieldvalues3 on objectcustomfieldvalues objectcustomfieldvalues_1 (cost=0.56..807603.40 rows=915 width=4) (actual time=21165.441..21165.441 rows=0 loops=1)
Filter: ((disabled = 0) AND ((largecontent ~~* '%958575%'::text) OR ((content)::text ~~* '%958575%'::text)))
Rows Removed by Filter: 19030904

"rows=0" is the telltale. So even after we're past objectid = 1223, that
scan continues, because the mergejoin needs to see a higher key before it
knows it can stop.

That's kind of annoying :-(. I wonder if there's a way to be smarter?
This case would work a lot better if the filter conditions were not
applied till after the merge; but of course that wouldn't be an
improvement in general. Or maybe we should penalize the mergejoin
cost estimate if there's a highly selective filter in the way.
(It does look like the planner is correctly estimating that the
filter is quite selective --- it's just not considering the potential
impact on the scan-until-finding-a-greater-key behavior.)

Right now I don't have any better suggestion than disabling mergejoin
if you think the filter is going to be very selective.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2020-06-20 18:22:03 Re: PostgreSQL 12.3 slow index scan chosen
Previous Message Kenneth Marshall 2020-06-19 22:37:16 Re: PostgreSQL 12.3 slow index scan chosen