Geometric types row estimation

From: Igor ALBUQUERQUE SILVA <i(dot)albuquerque-silva(at)kayrros(dot)com>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Geometric types row estimation
Date: 2022-11-30 16:44:36
Message-ID: CADArKzXZnFixqw-GRkpwOasYzJRQOE7qH-6XgocG_njvM7EMmA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello everyone,

I'm having a problem regarding the point type/gist indexes. Here's a
minimal reproduction of it:

create table test(p point);
insert into test(p) values (point(0, 0));
insert into test(p) values (point(0, 1));
insert into test(p) values (point(1, 0));
insert into test(p) values (point(1, 1));
insert into test(p) values (point(50, 0));
analyze test;
explain analyze select * from test where p <@ box '(0,0),(1,1)';
explain analyze select * from test where p <@ box '(50,0),(51,1)';

The two queries get the same cost/row estimation, of 1 row. This is the
EXPLAIN ANALYZE of the first query:

Seq Scan on test (cost=0.00..1.07 rows=1 width=16) (actual
time=0.022..0.026 rows=4 loops=1)
Filter: ((p[0] >= '0'::double precision) AND (p[0] <= '1'::double
precision))
Rows Removed by Filter: 1
Planning Time: 0.115 ms
Execution Time: 0.055 ms
(5 rows)

What I was expecting is the first query to estimate 4 rows and the second
to estimate 1, like what I get If I try the same thing using integers.

create table test(x integer, y integer);
insert into test(x, y) values (0, 0);
insert into test(x, y) values (0, 1);
insert into test(x, y) values (1, 0);
insert into test(x, y) values (1, 1);
insert into test(x, y) values (50, 0);
analyze test;
explain analyze select * from test where x between 0 and 1 and y between 0
and 1;
explain analyze select * from test where x between 50 and 51 and y between
0 and 1;

My question is: is this expected behaviour? I actually have a much larger
table with a gist index where I found this occurring, and this causes the
planner to make bad decisions: every query that I do will have the same
estimation, and whenever this estimation is very wrong, the planner does
not take the optimal decision.

I'm using the official docker image, PostgreSQL 15.1 (Debian
15.1-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6)
10.2.1 20210110, 64-bit, running everything in psql (PostgreSQL) 15.1
(Ubuntu 15.1-1.pgdg22.04+1).

Best regards,
Igor

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Igor ALBUQUERQUE SILVA 2022-11-30 16:46:58 Re: Geometric types row estimation
Previous Message Tom Lane 2022-11-30 16:36:28 Re: Catching up with performance & PostgreSQL 15