From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Greg Stark <stark(at)mit(dot)edu>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: PATCH: Using BRIN indexes for sorted output |
Date: | 2023-02-18 18:51:09 |
Message-ID: | 20230218185109.GH1653@telsasoft.com |
Lists: | pgsql-hackers |
Are (any of) these patches targeting v16?
typos:
ar we - we are?
morestly - mostly
interstect - intersect
> + * XXX We don't sort the bins, so just do binary sort. For large number of values
> + * this might be an issue, for small number of values a linear search is fine.
"binary sort" is wrong?
> + * only half of there ranges, thus 1/2. This can be extended to randomly
half of *these* ranges ?
> From 7b3307c27b35ece119feab4891f03749250e454b Mon Sep 17 00:00:00 2001
> From: Tomas Vondra <tomas(dot)vondra(at)postgresql(dot)org>
> Date: Mon, 17 Oct 2022 18:39:28 +0200
> Subject: [PATCH 01/11] Allow index AMs to build and use custom statistics
I think the idea can also apply to btree - currently, correlation is
treated as a property of a column, not of an index. That fails to
distinguish between a freshly built index and an index with out-of-order
heap references, which can make an index scan a lot more expensive.
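For illustration (using the t/i names from my test case below): the only
correlation statistic the planner has is the per-column one in pg_stats,
keyed by table and attribute - nothing associates it with any particular
index:
-- per-column correlation; a freshly-clustered index and one whose heap
-- references are badly scattered are both costed from this same value
SELECT tablename, attname, correlation
  FROM pg_stats
 WHERE tablename = 't' AND attname = 'i';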
I implemented per-index correlation stats way back when:
https://www.postgresql.org/message-id/flat/20160524173914.GA11880%40telsasoft.com
See also:
https://www.postgresql.org/message-id/14438.1512499811@sss.pgh.pa.us
With my old test case, the index scan is 3x slower than the bitmap scan,
but is costed as cheaper:
postgres=# explain analyze SELECT * FROM t WHERE i>11 AND i<55;
Index Scan using t_i_idx on t (cost=0.43..21153.74 rows=130912 width=8) (actual time=0.107..222.737 rows=128914 loops=1)
postgres=# SET enable_indexscan =no;
postgres=# explain analyze SELECT * FROM t WHERE i>11 AND i<55;
Bitmap Heap Scan on t (cost=2834.28..26895.96 rows=130912 width=8) (actual time=16.830..69.860 rows=128914 loops=1)
If the table is clustered, the index scan is almost twice as fast, and the
costs are more consistent with the actual runtimes. The planner assumes
that indexes are always freshly built (see the costing sketch after the
example below)...
postgres=# CLUSTER t USING t_i_idx ;
postgres=# explain analyze SELECT * FROM t WHERE i>11 AND i<55;
Index Scan using t_i_idx on t (cost=0.43..20121.74 rows=130912 width=8) (actual time=0.084..117.549 rows=128914 loops=1)
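The reason the uncorrelated case gets costed too cheaply is (IIRC) that
cost_index() interpolates between its best-case and worst-case I/O cost
using the square of the *column's* correlation, so any index on a
well-correlated column is priced as if it returned tuples in heap order.
A rough illustration, with made-up min/max I/O cost numbers:
-- io_cost ~= max_io + correlation^2 * (min_io - max_io), per costsize.c;
-- with |correlation| near 1 this lands near min_io, i.e. nearly-sequential
-- heap access, whether or not this particular index actually delivers that
SELECT correlation,
       max_io + correlation^2 * (min_io - max_io) AS interpolated_io_cost
  FROM pg_stats,
       (VALUES (100.0, 10000.0)) AS io(min_io, max_io)  -- made-up numbers
 WHERE tablename = 't' AND attname = 'i';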
--
Justin