Re: Improving Performance of Query ~ Filter by A, Sort by B

From: Lincoln Swaine-Moore <lswainemoore(at)gmail(dot)com>
To: legrand legrand <legrand_legrand(at)hotmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Improving Performance of Query ~ Filter by A, Sort by B
Date: 2018-07-11 23:31:46
Message-ID: CABcidkLEZ=WifDQJLtPFaTeFLYTFkEbXDhbFLGVk5FzTVDSbbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thanks for looking into this!

Here's the result (I turned off the timeout and got it to finish):

EXPLAIN ANALYZE
SELECT "a"."id"
FROM a_partition1 "a"
WHERE "a"."parent_id" IN (
49188,14816,14758,8402
)
ORDER BY "a"."tmstmp" DESC
LIMIT 20;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..5710.03 rows=20 width=12) (actual
time=1141878.105..1142350.296 rows=20 loops=1)
-> Index Scan Backward using a_tmstmp_idx1 on a_partition1 a
(cost=0.43..1662350.21 rows=5823 width=12) (actual
time=1141878.103..1142350.274 rows=20 loops=1)
Filter: (parent_id = ANY ('{49188,14816,14758,8402}'::integer[]))
Rows Removed by Filter: 7931478
Planning time: 0.122 ms
Execution time: 1142350.336 ms
(6 rows)
(Note: I've chosen parent_ids that I know are associated with the part_key
1, but the query plan was the same with the 4 parent_ids in your query.)

Looks like it's using the filter in the same way as the query on the parent
table, so seems be a problem beyond the partitioning.

And as soon as I cut it back to 3 parent_ids, jumps to a query plan
using a_parent_id_idx1
again:

EXPLAIN ANALYZE
SELECT "a"."id"
FROM a_partition1 "a"
WHERE "a"."parent_id" IN (
19948,21436,41220
)
ORDER BY "a"."tmstmp" DESC
LIMIT 20;

QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=5004.57..5004.62 rows=20 width=12) (actual
time=36.329..36.341 rows=20 loops=1)
-> Sort (cost=5004.57..5015.49 rows=4367 width=12) (actual
time=36.328..36.332 rows=20 loops=1)
Sort Key: tmstmp DESC
Sort Method: top-N heapsort Memory: 26kB
-> Index Scan using a_parent_id_idx1 on a_partition1 a
(cost=0.43..4888.37 rows=4367 width=12) (actual time=5.581..36.270 rows=50
loops=1)
Index Cond: (parent_id = ANY
('{19948,21436,41220}'::integer[]))
Planning time: 0.117 ms
Execution time: 36.379 ms
(8 rows)

Thanks again for your help!

On Wed, Jul 11, 2018 at 5:41 PM, legrand legrand <
legrand_legrand(at)hotmail(dot)com> wrote:

> Hello,
>
> I have tested it with release 11 and limit 20 is pushed to each partition
> when using index on tmstmp.
>
> Could you tell us what is the result of your query applyed to one
> partition
>
> EXPLAIN ANALYZE
> SELECT "a"."id"
> FROM a_partition1 "a"
> WHERE "a"."parent_id" IN (
> 34226,24506,40987,27162
> )
> ORDER BY "a"."tmstmp" DESC
> LIMIT 20;
>
> May be that limit 20 is not pushed to partitions in your version ?
> Regards
> PAscal
>
>
>
>
>
> --
> Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-
> f2050081.html
>
>

--
Lincoln Swaine-Moore

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2018-07-12 03:10:18 Re: Improving Performance of Query ~ Filter by A, Sort by B
Previous Message legrand legrand 2018-07-11 21:41:55 Re: Improving Performance of Query ~ Filter by A, Sort by B