Re: Limit Heap Fetches / Rows Removed by Filter in Index Scans

From: Victor Blomqvist <vb(at)viblo(dot)se>
To: Francisco Olarte <folarte(at)peoplecall(dot)com>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Limit Heap Fetches / Rows Removed by Filter in Index Scans
Date: 2016-08-19 18:01:57
Message-ID: CAL870DVEaNJkFv7bk5EnsZ3vDdP2DQe8iLgLfqJpJi_rk-0yHA@mail.gmail.com
Lists: pgsql-general

On Sat, Aug 20, 2016 at 1:13 AM, Francisco Olarte <folarte(at)peoplecall(dot)com>
wrote:

> Hi Victor:
>
>
> On Fri, Aug 19, 2016 at 7:02 PM, Victor Blomqvist <vb(at)viblo(dot)se> wrote:
> > What I want to avoid is my query visiting the whole 1m rows to get a
> result,
> > because in my real table that can take 100sec. At the same time I want
> the
> > queries that only need to visit 1k rows finish quickly, and the queries
> that
> > visit 100k rows at least get some result back.
>
> You are going to have problems with that. If you just want to limit it
> to max 100k rows, max 10 results my solution works, probably better as
> nested selects than CTEs, but someone more knowledgeable in the
> optimizer will need to say something ( or tests will be needed ). This
> is because "the queries that visit 100k rows at least get some result
> back" may be false: you may need to visit the whole 1M to get the
> first result if you are unlucky. Just set ap=999 where id=1M and ask
> for ap>=999 and you've got that degenerate case, which can only be
> saved if you have an index on ap ( even with statistics, you would
> need a full table scan to find it ).
>
> If you are positive some results are in the first 100k rows, then my
> method works fine, how fast will need to be tested with the real data.
> You can even try using *10, *100, *1k of the real limit until you have
> enough results if you want to time-limit your queries.
>
>
> Francisco Olarte.
>

Thanks! A subselect seems to do it.

I didn't think of it before; I guess I got blinded by the CTE, since usually
it's the other way around and the CTE is the answer to the problem.
But it seems the easy solution with a good old subselect fixes it.
Now I feel a bit embarrassed by such an easy answer :)

Checking these two queries I can see that the first one visits the
max 50 rows it's allowed to and returns 5 rows, while the second one
finishes after fetching 13 rows and returns the full 10 rows.

select *
from (select * from b order by id limit 50) x
where age_preference%10 < 1
order by id limit 10

select *
from (select * from b order by id limit 50) x
where age_preference%10 < 5
order by id limit 10
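For anyone who wants to play with the pattern outside Postgres, here is a
small self-contained sketch using Python's sqlite3. The schema and data for
table "b" are invented just for illustration (the real table and its
distribution are not shown in this thread); the point is only the
scan-limiting subselect itself:

```python
import sqlite3

# Toy stand-in for table "b"; schema and data are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table b (id integer primary key, age_preference integer)")
conn.executemany("insert into b values (?, ?)",
                 [(i, i % 100) for i in range(1, 1001)])

# The inner subselect caps the rows examined at 50; the outer query
# filters within that window and returns at most 10 matches.
sparse = conn.execute("""
    select * from (select * from b order by id limit 50) x
    where age_preference % 10 < 1
    order by id limit 10
""").fetchall()

dense = conn.execute("""
    select * from (select * from b order by id limit 50) x
    where age_preference % 10 < 5
    order by id limit 10
""").fetchall()

print(len(sparse), len(dense))
```

With this toy data the sparse filter finds only 5 matches inside the 50-row
window (ids 10, 20, 30, 40, 50), while the dense one fills its limit of 10 —
the same shape of behaviour described above, though the exact row counts
depend entirely on the data.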

/Victor
