Re: Geometric types row estimation

From: Igor ALBUQUERQUE SILVA <i(dot)albuquerque-silva(at)kayrros(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Geometric types row estimation
Date: 2022-11-30 17:38:16
Message-ID: CADArKzWtKs8ax0wUUa1RUC0APfXbF6WZREhOtir-tBYE8b1Grw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Tom,

Thanks a lot for the explanation, I thought the built-in types were more
standard, so I didn't mention that I was having the same thing using
postgis. Here's the example (I changed the values a little bit to avoid
rounding errors):

create table test(p geometry(point));
insert into test(p) values (st_makepoint(0,0));
insert into test(p) values (st_makepoint(0,1));
insert into test(p) values (st_makepoint(1,0));
insert into test(p) values (st_makepoint(1,1));
insert into test(p) values (st_makepoint(50,0));
analyze test;
explain analyze select * from test where
ST_Contains(ST_GeomFromText('POLYGON((-1 -1,2 -1,2 2,-1 2,-1 -1))'), p);
explain analyze select * from test where
ST_Contains(ST_GeomFromText('POLYGON((49 -1,51 -1,51 1,49 1,49 -1))'), p);

EXPLAIN ANALYZE:

Seq Scan on test (cost=0.00..126.05 rows=1 width=32) (actual
time=0.015..0.022 rows=4 loops=1)
Filter:
st_contains('01030000000100000005000000000000000000F0BF000000000000F0BF0000000000000040000000000000F0BF00000000000000400000000000000040000000000000F0BF0000000000000040000000000000F0BF000000000000F0BF'::geometry,
p)
Rows Removed by Filter: 1
Planning Time: 0.072 ms
Execution Time: 0.035 ms
(5 rows)

Do you know if the functions in Postgis are also stubbed? Or maybe I'm
doing something wrong with the syntax?

This time I'm using the postgis docker image, PostgreSQL 15.1 (Debian
15.1-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6)
10.2.1 20210110, 64-bit

Best regards,
Igor

On Wed, 30 Nov 2022 at 18:18, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Igor ALBUQUERQUE SILVA <i(dot)albuquerque-silva(at)kayrros(dot)com> writes:
> > I'm having a problem regarding the point type/gist indexes. Here's a
> > minimal reproduction of it:
> > ...
> > What I was expecting is the first query to estimate 4 rows and the second
> > to estimate 1, like what I get If I try the same thing using integers.
>
> Unfortunately, the selectivity estimation functions for PG's geometric
> types are mostly just stubs. The estimation function for point <@ box
> in particular is contsel [1]:
>
> /*
> * contsel -- How likely is a box to contain (be contained by) a
> given box?
> *
> * This is a tighter constraint than "overlap", so produce a smaller
> * estimate than areasel does.
> */
> Datum
> contsel(PG_FUNCTION_ARGS)
> {
> PG_RETURN_FLOAT8(0.001);
> }
>
> It's been like that (excepting notational changes) since Berkeley days,
> because nobody has bothered to make it better.
>
> In general, PG's built-in geometric types have never gotten much
> beyond their origins as an academic proof-of-concept. I think people
> who are doing serious work that requires such operations mostly use
> PostGIS, and I'd suggest looking into that.
>
> Or, if you feel like doing a lot of work to make these estimators
> better, have at it.
>
> regards, tom lane
>
> [1]
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/backend/utils/adt/geo_selfuncs.c;hb=HEAD
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2022-11-30 17:45:47 Re: Geometric types row estimation
Previous Message Tom Lane 2022-11-30 17:18:41 Re: Geometric types row estimation