Re: Limit Heap Fetches / Rows Removed by Filter in Index Scans

From: Francisco Olarte <folarte(at)peoplecall(dot)com>
To: Victor Blomqvist <vb(at)viblo(dot)se>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Limit Heap Fetches / Rows Removed by Filter in Index Scans
Date: 2016-08-19 10:01:49
Message-ID: CA+bJJbx-nbJP3UQy1YbSOBK9iBgw6CAEchw58yzv_RS6c2rKVA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Victor:

On Fri, Aug 19, 2016 at 7:06 AM, Victor Blomqvist <vb(at)viblo(dot)se> wrote:
> Is it possible to break/limit a query so that it returns whatever results
> found after having checked X amount of rows in a index scan?
>
> For example:
> create table a(id int primary key);
> insert into a select * from generate_series(1,100000);
>
> select * from a
> where id%2 = 0
> order by id limit 10
>
> In this case the query will "visit" 20 rows and filter out 10 of them. We
> can see that in the query plan:
> "Rows Removed by Filter: 10"
> "Heap Fetches: 20"
>
> Is it somehow possible to limit this query so that it only fetches X amount,
> in my example if we limited it to 10 Heap Fetches the query could return the
> first 5 rows?
>
>
> My use case is I have a table with 35 million rows with a geo index, and I
> want to do a KNN search but also limit the query on some other parameters.
> In some cases the other parameters restrict the query so much that Heap
> Fetches becomes several 100k or more, and in those cases I would like to
> have a limit to my query.

Well, if you accept more abstract limits (i.e. you do not depend on
heap fetches, you just want up to 5 even IDs from the first 10 IDs )
you could try:

with base as (select * from a order by id limit 10)
select * from base where id %2 = 0 order by id limit 5;

( Or do it with a subquery instead of a CTE ).

In general

select * from table where common_condition and filter_condition order
by xx limit N

becomes

with base as (select * from table where common_condition order by xx
limit base_fecthes)
select * from base where filter_condition order by XX limit N;

In the example common_condition is non existent, put it as true,
optimize after transforming.

Francisco Olarte.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Thomas Güttler 2016-08-19 10:06:05 Re: PG vs ElasticSearch for Logs
Previous Message Sameer Kumar 2016-08-19 09:21:43 Re: PG vs ElasticSearch for Logs