From: Victor Blomqvist <vb(at)viblo(dot)se>
To: Francisco Olarte <folarte(at)peoplecall(dot)com>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Limit Heap Fetches / Rows Removed by Filter in Index Scans
Date: 2016-08-19 18:01:57
Message-ID: CAL870DVEaNJkFv7bk5EnsZ3vDdP2DQe8iLgLfqJpJi_rk-0yHA@mail.gmail.com
Lists: pgsql-general
On Sat, Aug 20, 2016 at 1:13 AM, Francisco Olarte <folarte(at)peoplecall(dot)com>
wrote:
> Hi Victor:
>
>
> On Fri, Aug 19, 2016 at 7:02 PM, Victor Blomqvist <vb(at)viblo(dot)se> wrote:
> > What I want to avoid is my query visiting the whole 1m rows to get a
> > result, because in my real table that can take 100sec. At the same time
> > I want the queries that only need to visit 1k rows finish quickly, and
> > the queries that visit 100k rows at least get some result back.
>
> You are going to have problems with that. If you just want to limit it
> to max 100k rows and max 10 results, my solution works, probably better
> as nested selects than CTEs, but someone more knowledgeable about the
> optimizer will need to say something ( or tests will be needed ). This
> is because "the queries that visit 100k rows at least get some result
> back" may be false: you may need to visit the whole 1M to get the
> first result if you are unlucky. Just set ap=999 where id=1M and ask
> for ap>=999 and you've got that degenerate case, which can only be
> saved if you have an index on ap ( even with statistics, you would
> need a full table scan to find it ).
>
> If you are positive some results are in the first 100k rows, then my
> method works fine, how fast will need to be tested with the real data.
> You can even try using *10, *100, *1k of the real limit until you have
> enough results if you want to time-limit your queries.
>
>
> Francisco Olarte.
>
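Francisco's escalating-limit idea above can be sketched as a retry loop: run the bounded sub-select with a small scan cap, and if it returns too few rows, retry with the cap multiplied by 10. This is only an illustration with SQLite standing in for Postgres; the table name `b`, the filter, and the data are all made up for the sketch.

```python
import sqlite3

# SQLite stand-in for the Postgres table "b"; invented data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY, age_preference INTEGER)")
conn.executemany("INSERT INTO b VALUES (?, ?)",
                 [(i, i * 7) for i in range(1, 100001)])

def bounded_query(scan_cap, want=10):
    # The inner sub-select caps how many rows the filter may ever examine.
    return conn.execute(
        """SELECT * FROM (SELECT * FROM b ORDER BY id LIMIT ?) x
           WHERE age_preference % 1000 = 0
           ORDER BY id LIMIT ?""",
        (scan_cap, want),
    ).fetchall()

def escalating(want=10, start_cap=50, max_cap=100000):
    # Retry with *10, *100, *1000 of the starting cap until we have enough
    # rows, or the cap reaches the hard upper bound (our time budget proxy).
    cap = start_cap
    while True:
        rows = bounded_query(cap, want)
        if len(rows) >= want or cap >= max_cap:
            return rows
        cap *= 10

rows = escalating()
```

With this data the matches are sparse (one in a thousand rows), so the loop escalates the cap several times before finding ten matches, while a query against dense matches would be answered by the very first, cheap attempt.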
Thanks! A sub-select seems to do it.
I didn't think of it before; I guess I got blinded by the CTE, since
usually it's the other way around and the CTE is the answer to the
problem. But it seems the easy solution with a good old sub-select
fixes it. Now I feel a bit embarrassed over such an easy answer :)
Checking these two queries I can see that the first one visits the
max 50 rows it's allowed to and returns 5 rows, while the second one
finishes after 13 rows fetched and returns the full 10 rows.
select *
from (select * from b order by id limit 50) x
where age_preference % 10 < 1
order by id limit 10;

select *
from (select * from b order by id limit 50) x
where age_preference % 10 < 5
order by id limit 10;
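The behaviour of the two queries can be reproduced with SQLite standing in for Postgres (the table name `b` and the `age_preference` filter come from the queries above, but the data here is made up, so the exact row counts differ from the real table):

```python
import sqlite3

# SQLite stand-in for the Postgres table "b"; data is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY, age_preference INTEGER)")
conn.executemany("INSERT INTO b VALUES (?, ?)", [(i, i) for i in range(1, 1001)])

# The inner sub-select caps the rows the outer filter may examine at 50,
# no matter how selective the filter turns out to be.
query = """
    SELECT * FROM (SELECT * FROM b ORDER BY id LIMIT 50) x
    WHERE age_preference % 10 < ?
    ORDER BY id LIMIT 10
"""

few = conn.execute(query, (1,)).fetchall()   # selective: only 5 of the 50 rows match
many = conn.execute(query, (5,)).fetchall()  # permissive: 10 matches found early
print(len(few), len(many))
```

Either way, the scan is bounded at 50 rows: a selective filter simply returns fewer than the requested 10 rows instead of walking the whole table.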
/Victor