Quick Links

Re: Consider the number of columns in the sort cost model

From:	Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To:	Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc:	Kirill Reshke <reshkekirill(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject:	Re: Consider the number of columns in the sort cost model
Date:	2024-10-31 15:18:46
Message-ID:	ac3b7851-188a-4609-8d81-68ba03dae55d@postgrespro.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

> I played around with the examples a bit and couldn't figure out
> something. When I added the same values to different columns -
> firstly in a, later in b, the order of the columns for sort operation
> doesn't change. Isn't this a mistake?
>
> create table a (x1 int, y1 int);
> create table b (x2 int, y2 int);
> insert into a values (NULL, NULL);
> insert into a values (NULL, 1);
> insert into a values (1, 1);
> insert into a values (1, NULL);
>
> create index a_x1_idx on a(x1);
> create index b_x2_idx on b(x2);
> create index a_y1_idx on a(y1);
> create index b_y2_idx on b(y2);
>
> insert into b select 1, 2 from generate_series(11,20) as id;
> insert into b select 1, 1 from generate_series(1,10) as id;
> insert into b select 1, 3 from generate_series(3,30) as id;
>
> explain analyze select a.x1, s.x2, s.y2 from a left join (select
> distinct * from b) s on a.x1=s.x2;
>                                                  QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
> Hash Right Join (cost=44.99..48.15 rows=5 width=12) (actual
> time=0.225..0.250 rows=8 loops=1)
>    Hash Cond: (b.x2 = a.x1)
>    -> HashAggregate (cost=43.90..46.16 rows=226 width=8) (actual
> time=0.117..0.123 rows=3 loops=1)
>          Group Key: b.x2, b.y2
>          Batches: 1 Memory Usage: 40kB
>          -> Seq Scan on b (cost=0.00..32.60 rows=2260 width=8)
> (actual time=0.030..0.044 rows=48 loops=1)
>    -> Hash (cost=1.04..1.04 rows=4 width=4) (actual
> time=0.073..0.074 rows=4 loops=1)
>          Buckets: 1024 Batches: 1 Memory Usage: 9kB
>          -> Seq Scan on a (cost=0.00..1.04 rows=4 width=4) (actual
> time=0.047..0.051 rows=4 loops=1)
> Planning Time: 1.649 ms
> Execution Time: 0.485 ms
> (11 rows)
>
> delete from b;
> insert into b select 2, 1 from generate_series(11,20) as id;
> insert into b select 1, 1 from generate_series(1,10) as id;
> insert into b select 3, 1 from generate_series(3,30) as id;
> vacuum analyze;
> explain analyze select a.x1, s.x2, s.y2 from a left join (select
> distinct * from b) s on a.x1=s.x2;
>                                                   QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------
> Hash Left Join (cost=1.79..2.86 rows=4 width=12) (actual
> time=0.083..0.090 rows=4 loops=1)
>    Hash Cond: (a.x1 = b.x2)
>    -> Seq Scan on a (cost=0.00..1.04 rows=4 width=4) (actual
> time=0.010..0.011 rows=4 loops=1)
>    -> Hash (cost=1.75..1.75 rows=3 width=8) (actual
> time=0.067..0.068 rows=3 loops=1)
>          Buckets: 1024 Batches: 1 Memory Usage: 9kB
>          -> HashAggregate (cost=1.72..1.75 rows=3 width=8) (actual
> time=0.063..0.064 rows=3 loops=1)
>                Group Key: b.x2, b.y2
>                Batches: 1 Memory Usage: 24kB
>                -> Seq Scan on b (cost=0.00..1.48 rows=48 width=8)
> (actual time=0.006..0.014 rows=48 loops=1)
> Planning Time: 0.391 ms
> Execution Time: 0.151 ms
> (11 rows)

Sorry, I missed vacuum analyze before deleting all data from table b,
but after running it I still got the same plan.

alena(at)postgres=# create table a (x1 int, y1 int);
create table b (xcreate table a (x1 int, y1 int);
create table b (x2 int, y2 int););
insert into a values (NULL, NULL);
insert into a values (NULL, 1);
insert into a values (1, 1);L);
insert into a values (1, NULL);
create index a_x1_idx on a(x1);
create index a_x1_idx on a(x1);
create index b_x2_idx on b(x2);
create index a_y1_idx on a(y1);
create index b_y2_idx on b(y2);
insert into b select 1, 2 from generate_series(11,20) as id;
insert into b select 1, 2 from generate_series(11,20) as id;
insert into b select 1, 1 from generate_series(1,10) as id;
insert into b select 1, 3 from generate_series(3,30) as id;

alena(at)postgres=# vacuum analyze;
VACUUM
alena(at)postgres=# explain analyze select a.x1, s.x2, s.y2 from a left
join (select distinct * from b) s on a.x1=s.x2;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=1.79..2.86 rows=4 width=12) (actual
time=0.168..0.185 rows=8 loops=1)
   Hash Cond: (a.x1 = b.x2)
   -> Seq Scan on a (cost=0.00..1.04 rows=4 width=4) (actual
time=0.027..0.029 rows=4 loops=1)
   -> Hash (cost=1.75..1.75 rows=3 width=8) (actual time=0.129..0.130
rows=3 loops=1)
         Buckets: 1024 Batches: 1 Memory Usage: 9kB
         -> HashAggregate (cost=1.72..1.75 rows=3 width=8) (actual
time=0.119..0.123 rows=3 loops=1)
               Group Key: b.x2, b.y2
               Batches: 1 Memory Usage: 24kB
               -> Seq Scan on b (cost=0.00..1.48 rows=48 width=8)
(actual time=0.013..0.029 rows=48 loops=1)
Planning Time: 1.464 ms
Execution Time: 0.352 ms
(11 rows)

--
Regards,
Alena Rybakina
Postgres Professional

In response to

Re: Consider the number of columns in the sort cost model at 2024-10-31 14:00:09 from Alena Rybakina

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Melanie Plageman	2024-10-31 15:25:49	Re: Count and log pages set all-frozen by vacuum
Previous Message	Peter Geoghegan	2024-10-31 15:15:16	Re: Count and log pages set all-frozen by vacuum