Re: On disable_cost

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, andrew(at)ankane(dot)org
Subject: Re: On disable_cost
Date: 2024-08-23 17:26:26
Message-ID: 2267696.1724433986@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Aug 23, 2024 at 11:17 AM Jonathan S. Katz <jkatz(at)postgresql(dot)org> wrote:
>> We hit an issue with pgvector[0] where a regular `SELECT count(*) FROM
>> table`[1] is attempting to scan the index on the vector column when
>> `enable_seqscan` is disabled. Credit to Andrew Kane (CC'd) for flagging it.

> It took me a moment to wrap my head around this: the cost estimate is
> 312 decimal digits long. Apparently hnswcostestimate() just returns
> DBL_MAX when there are no scan keys because it really, really doesn't
> want to do that. Before e2225346, that kept this plan from being
> generated because it was (much) larger than disable_cost. But now it
> doesn't, because 1 disabled node makes a path more expensive than any
> possible non-disabled path. Since that was the whole point of the
> patch, I don't feel too bad about it.

Yeah, I don't think it's necessary for v18 to be bug-compatible with
this hack.

> If you don't want to fix hnsw to work the way the core optimizer
> thinks it should, or if there's some reason it can't be done,
> alternatives might include (1) having the cost estimate function hack
> the count of disabled nodes and (2) adding some kind of core support
> for an index cost estimator refusing a path entirely. I haven't tested
> (1) so I don't know for sure that there are no issues, but I think we
> have to do all of our cost estimating before we can think about adding
> the path so I feel like there's a decent chance it would do what you
> want.

It looks like amcostestimate could change the path's disabled_nodes
count, since that's set up before invoking amcostestimate. I guess
it could be set to INT_MAX to have a comparable solution to before.
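
To make the interaction concrete, here is a minimal standalone sketch in C. It is not PostgreSQL source: SketchPath, sketch_path_is_cheaper(), and sketch_cost_estimate() are hypothetical names, and the comparison rule (fewer disabled nodes wins, cost only breaks ties) is a simplified model of the behavior described above. Under those assumptions it shows why a DBL_MAX cost alone no longer keeps the index path from winning against a disabled seqscan, and why setting disabled_nodes to INT_MAX would.

```c
#include <float.h>
#include <limits.h>
#include <stdbool.h>

/* Simplified stand-in for a planner path; NOT the real PostgreSQL struct. */
typedef struct SketchPath
{
    int     disabled_nodes;     /* count of disabled nodes in this path */
    double  total_cost;         /* estimated total cost */
} SketchPath;

/*
 * Assumed comparison rule: the path with fewer disabled nodes is preferred
 * regardless of cost; cost is consulted only when the counts are equal.
 */
bool
sketch_path_is_cheaper(const SketchPath *a, const SketchPath *b)
{
    if (a->disabled_nodes != b->disabled_nodes)
        return a->disabled_nodes < b->disabled_nodes;
    return a->total_cost < b->total_cost;
}

/*
 * Hypothetical index cost estimator. With no scan keys it returns a DBL_MAX
 * cost (the old hack); if refuse_via_disabled_nodes is set, it also bumps
 * the path's disabled_nodes count to INT_MAX (the alternative floated here).
 */
void
sketch_cost_estimate(SketchPath *index_path, bool has_scan_keys,
                     bool refuse_via_disabled_nodes)
{
    if (has_scan_keys)
    {
        index_path->total_cost = 100.0;     /* arbitrary illustrative cost */
        return;
    }

    index_path->total_cost = DBL_MAX;
    if (refuse_via_disabled_nodes)
        index_path->disabled_nodes = INT_MAX;
}
```

In this model, a seqscan path with disabled_nodes = 1 loses to the keyless index path when only the cost is DBL_MAX, and wins again once the estimator also sets disabled_nodes to INT_MAX. Whether the real planner behaves the same way in every code path is untested here.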

I agree with you that it is not great that hnsw is refusing this case
rather than finding a way to make it work, so I'm not excited about
putting in support for refusing it in a less klugy way.

regards, tom lane
