Re: Optimizing count(), but Explain estimates wildly off

From: Chema <chema(at)interneta(dot)org>
To: Greg Sabino Mullane <htamfids(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Optimizing count(), but Explain estimates wildly off
Date: 2024-03-06 14:00:00
Message-ID: CALdEsqNw4wNaZ_5B_Y4ZaSHp7FHVEUvcrWsNf5oW5o4Cp=gt4w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

>
> Yours will be different, as I cannot exactly duplicate your schema or data
> distribution, but give "SELECT 1" a try. This was on Postgres 16, FWIW,
> with a default_statistics_target of 100.
>

Select 1 produces a sequential scan, like Select * did before Vacuum Full.
But if I force an index scan with the subquery hack, there's a significant
improvement over Select *. Row estimate is still -50%|200%, so seems it's
only accurate for very simple queries indeed. In conclusion, I'll just
keep on count(*)ing with the subquery hack. Funny thing, Select 1 is
slightly faster than Select count(*), so I'm tempted to do Select count(*)
From (Select 1...) As iluffsubqueries. xD

(pg_roaringbitmap <https://github.com/ChenHuajun/pg_roaringbitmap> looks
great, but I expect it works with fixed categories, while I have several
full text search columns)

-- With previous country,date index
query x 100 | avg | min |
q1 | median | q3 | max
------------------------+--------------------+-------------------+-------------------+--------------------+-------------------+--------------------
Count Colombia | 9093.918731212616 | 6334.060907363892 |
7366.191983222961 | 9154.448866844177 | 10276.342272758484 |
13520.153999328613
Subquery Colombia | 7926.021897792816 | 5926.224946975708 |
7000.077307224274 | 7531.211018562317 | 8828.327298164368 |
11380.73992729187
Sel* Colombia | 8694.387829303741 | 6963.425874710083 |
8149.151265621185 | 8704.618453979492 | 9153.236508369446 |
11787.146806716919
Sel* Subquery Colombia | 8622.495520114899 | 6959.257125854492 |
8179.068505764008 | 8765.061974525452 | 9159.55775976181 |
10187.61420249939
Sel1 Colombia | 22717.704384326935 | 8666.495084762573 |
22885.42276620865 | 23949.790477752686 | 24966.21882915497 |
30625.644207000732
Sel1 Subquery Colombia | 7529.951772689819 | 6241.269111633301 |
7127.403438091278 | 7577.62348651886 | 7866.843640804291 |
8954.48899269104
;

-- After including transaction_id in country,date index
query x 20 | avg | min |
q1 | median | q3 | max
------------------------+--------------------+--------------------+-------------------+--------------------+--------------------+--------------------
Count Colombia | 10326.94479227066 | 7079.586982727051 |
8091.441631317139 | 10685.971021652222 | 11660.240888595581 |
16219.580888748169
Subquery Colombia | 8345.360279083252 | 6759.0179443359375 |
7150.483548641205 | 7609.055519104004 | 8118.529975414276 |
15819.210052490234
Sel* Colombia | 9401.683914661407 | 8350.785970687866 |
8727.016389369965 | 9171.823978424072 | 9705.730974674225 |
12684.055089950562
Sel* Subquery Colombia | 10874.297595024109 | 7996.103048324585 |
9317.362785339355 | 10767.66049861908 | 12130.92851638794 |
14003.422021865845
Sel1 Colombia | 14704.787838459015 | 7033.560991287231 |
8938.009798526764 | 11308.07101726532 | 21711.08090877533 |
25156.877994537354
Sel1 Subquery Colombia | 7128.487503528595 | 5076.292991638184 |
5678.286790847778 | 6925.720572471619 | 8272.867858409882 |
11430.468082427979

query x 100 | avg | min |
q1 | median | q3 | max
------------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------
Count Colombia | 8165.0702357292175 | 5923.334121704102 |
6800.160050392151 | 7435.7980489730835 | 9075.710475444794 |
13613.409042358398
Subquery Colombia | 7299.517266750336 | 5389.672040939331 |
6362.253367900848 | 6781.42237663269 | 7978.189289569855 |
11542.781829833984
Sel* Colombia | 14157.406282424927 | 8775.223016738892 |
13062.03180551529 | 14233.824968338013 | 15513.144373893738 |
19184.97586250305
Sel* Subquery Colombia | 13438.675961494446 | 10216.159105300903 |
12183.876752853394 | 13196.363925933838 | 14356.310486793518 |
20111.860036849976
Sel1 Colombia | 13753.776743412018 | 7020.914793014526 |
7893.3587074279785 | 9101.168870925903 | 22971.67855501175 |
26913.809061050415
Sel1 Subquery Colombia | 6757.480027675629 | 5529.844045639038 |
6212.466478347778 | 6777.510046958923 | 7212.876975536346 |
8500.235080718994

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message kimaidou 2024-03-06 16:13:24 Re: Separate 100 M spatial data in 100 tables VS one big table
Previous Message Tomas Vondra 2024-03-05 20:13:13 Re: Separate 100 M spatial data in 100 tables VS one big table