Fw: Re: Bitmap scan seem like such a strange choice when "limit 1"

From: Klaudie Willis <Klaudie(dot)Willis(at)protonmail(dot)com>
To: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Fw: Re: Bitmap scan seem like such a strange choice when "limit 1"
Date: 2020-09-07 07:09:42
Message-ID: 5FS8XCRMtAJt2DHVsOLfyavYQtv_pUq57wLsG2X_-ialvKKS_pRohzYn5TINZPIXkn_LXxxGpXnSfnePD5Uydp8xajiW4fCCnlpx7GgwWXI=@protonmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> t > '2020-01-01'::timestamp and t < '2021-01-01'::timestamp
>Not at all important, but it seems odd to be exclusive of the start and end both. I would >consider including the start with >=
>Michael Lewis | Database Engineer
>Entrata

Michael, funny I was thinking that myself minutes after posting. Perhaps it is that tiny gap that makes a difference; however changing it to t >= '2020....etc' and perfectly matching the partition range, did not change anything of significance in the explain or runtime. :-|

On that other topic, n_distinct, it is for the moment indeed hardcoded to -0,1. I have tried to reset n_distinct, and run analyze with default_target_statistics = 2000; no dice!
However, the cars_ref in question, is present in the most_common_vals of pg_stats, and according to that frequency array, that value occurs with a frequency of 1,7%. That seems correct.

select count(*)
from bigtablet
where cars_ref = 1769854207
and t >= '2020-01-01'::timestamp and t < '2021-01-01'::timestamp;
--> 2 817 169

I can add that car_ref in general is quite skewed in its distribution, but I don't think that is the issue here.
I think the key hint is that when targeting the partition child table directly, the plan changes. See below for "proof"

explain (analyze,buffers)
select *
from bigtable
where car_ref = 1769854207 and t >= '2020-01-01'::timestamp and t < '2021-01-01'::timestamp
limit 1

Limit (cost=24961.76..24962.67 rows=1 width=636) (actual time=1456.315..1456.316 rows=1 loops=1)
Buffers: shared hit=2377
-> Bitmap Heap Scan on bigtable_y2020 bigtable (cost=24961.76..2640351.94 rows=2874279 width=636) (actual time=1456.313..1456.314 rows=1 loops=1)
Recheck Cond: (car_ref = 1769854207)
Filter: ((t >= '2020-01-01 00:00:00'::timestamp without time zone) AND (t < '2021-01-01 00:00:00'::timestamp without time zone))
Heap Blocks: exact=1
Buffers: shared hit=2377
-> Bitmap Index Scan on bigtable_2020_ref_index (cost=0.00..24243.19 rows=2874336 width=0) (actual time=721.428..721.428 rows=2817169 loops=1)
Index Cond: (car_ref = 1769854207)
Buffers: shared hit=2376
Planning Time: 0.321 ms
Execution Time: 1480.087 ms

explain (analyze,buffers)
select *
from bigtable_y2020 tt
where car_ref = 1769854207 and t >= '2020-01-01'::timestamp and t < '2021-01-01'::timestamp
limit 1

Limit (cost=0.57..1.60 rows=1 width=636) (actual time=0.037..0.038 rows=1 loops=1)
Buffers: shared hit=5
-> Index Scan using bigtable_2020_ref_index on bigtable_y2020 tt (cost=0.57..2967225.58 rows=2874279 width=636) (actual time=0.036..0.036 rows=1 loops=1)
Index Cond: (car_ref = 1769854207)
Filter: ((t >= '2020-01-01 00:00:00'::timestamp without time zone) AND (t < '2021-01-01 00:00:00'::timestamp without time zone))
Buffers: shared hit=5
Planning Time: 0.349 ms
Execution Time: 0.106 ms

best regards
K

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Matthias Apitz 2020-09-07 09:22:47 compatibility matrix between client and server
Previous Message Klaudie Willis 2020-09-07 07:06:53 Fw: Re: Bitmap scan seem like such a strange choice when "limit 1"