Re: Query with no result set, really really slow adding ORBDER BY / LIMIT clause

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Marco Renzi <renzi(dot)mrc(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Query with no result set, really really slow adding ORBDER BY / LIMIT clause
Date: 2017-04-20 16:05:59
Message-ID: CAFj8pRDjB5P5=hDdK9yqZRxxoVF3P_hKfmBQBdvkRejMmYZn=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

2017-04-20 17:57 GMT+02:00 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:

>
>
> 2017-04-20 9:19 GMT+02:00 Marco Renzi <renzi(dot)mrc(at)gmail(dot)com>:
>
>> Hi!, i've currently a big problem using ORBDER BY / LIMIT in a query
>> with no result set.
>> If i add the order by/limit clause it runs really really slow.
>>
>>
>>
>> QUERY 1 FAST:
>> --------------------------------
>>
>> SELECT fase.id
>> FROM tipofase
>> JOIN fase
>> ON (fase.tipofase = tipofase.id)
>> WHERE tipofase.agendafrontoffice = true
>>
>> EXPLAIN ANALYZE:
>>
>> Nested Loop (cost=0.43..790.19 rows=14462 width=4) (actual time=0.079..0.079 rows=0 loops=1)
>>
>> -> Seq Scan on tipofase (cost=0.00..3.02 rows=1 width=4) (actual time=0.077..0.077 rows=0 loops=1)
>> Filter: agendafrontoffice
>> Rows Removed by Filter: 102
>> -> Index Only Scan using fase_test_prova_4 on fase (cost=0.43..595.59 rows=19158 width=8) (never executed)
>> Index Cond: (tipofase = tipofase.id)
>> Heap Fetches: 0
>> Planning time: 0.669 ms
>> Execution time: 0.141 ms
>>
>> ---
>>
>> It's perfect because it starts from tipofase, where there are no agendafrontoffice = true
>>
>> fase_test_prova_4 is a btree index ON (fase.tipofase, fase.id)
>> fase.id is PRIMARY key on fase,
>> tipofase.id is PRIMARY key on tipofase,
>> fase.tipofase is FK on tipofase.id
>> and tipofase.agendafrontoffice is a boolean.
>>
>> I've also created a btree index on tipofase.agendafrontoffice.
>>
>> **fase** is a large table with 1.475.146 records. There are no rows in
>> the table matching tipofase.agendafrontoffice = true, so the result set is
>> empty(QUERY 1)
>>
>>
>>
>>
>> QUERY 2 SLOW(WITH limit and order by):
>> --------------------------------
>>
>>
>> SELECT fase.id
>> FROM tipofase
>> JOIN fase
>> ON (fase.tipofase = tipofase.id)
>> WHERE tipofase.agendafrontoffice = true
>> ORDER BY fase.id DESC limit 10 offset 0
>>
>> Limit (cost=0.43..149.66 rows=10 width=4) (actual time=173853.131..173853.131 rows=0 loops=1)
>> -> Nested Loop (cost=0.43..215814.25 rows=14462 width=4) (actual time=173853.130..173853.130 rows=0 loops=1)
>> Join Filter: (fase.tipofase = tipofase.id)
>> -> Index Scan Backward using test_prova_2 on fase (cost=0.43..193684.04 rows=1475146 width=8) (actual time=1.336..173128.418 rows=1475146 loops=1)
>> -> Materialize (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1475146)
>> -> Seq Scan on tipofase (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1)
>> Filter: agendafrontoffice
>> Rows Removed by Filter: 102
>> Planning time: 0.685 ms
>> Execution time: 173853.221 ms
>>
>>
>>
> I am afraid so is not possible to solve this issue by one query. In this
> case the planner expects early stop due finding few values. But because
> there are not any value, the LIMIT clause has not any benefit in executor
> time, but the planner is messed. Maybe try to increase LIMIT to some higher
> value .. 1000, 10000 so planner don't fall to this trap. PostgreSQL
> statistics are about most common values, but the values without any
> occurrence are not well registered by statistics.
>
> Regards
>

It can looks strange, but it can work

SELECT *
FROM (your query ORDER BY .. OFFSET 0 LIMIT 10000) s
ORDER BY ...
LIMIT 10;

Regards

Pavel

>
> Pavel
>
>
>> Really really slow..... looks like the planner is not doing a good job.
>> PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit
>>
>>
>> I also run VACUUM AND VACUUM ANALYZE on both table
>> I tried to play with the
>> "alter table tipofase alter column agendafrontoffice set statistics 2"
>> but nothing.
>>
>> Thanks in advance Marco
>>
>>
>>
>> --
>> ------------------------------------------------------------
>> ------------------------------------------------------------
>> -------------------
>> Ing. Marco Renzi
>> OCA - Oracle Certified Associate Java SE7 Programmer
>> OCP - Oracle Certified Mysql 5 Developer
>>
>> via Zegalara 57
>> 62014 Corridonia(MC)
>> Mob: 3208377271 <(320)%20837-7271>
>>
>>
>> "The fastest way to change yourself is to hang out with people who are
>> already the way you want to be" Reid Hoffman
>>
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Marco Renzi 2017-04-21 06:49:54 Re: Query with no result set, really really slow adding ORBDER BY / LIMIT clause
Previous Message Pavel Stehule 2017-04-20 15:57:02 Re: Query with no result set, really really slow adding ORBDER BY / LIMIT clause