From: | Artur Zając CFI <azajac(at)cfi(dot)pl> |
---|---|
To: | <pgsql-performance(at)postgresql(dot)org> |
Subject: | Planner estimates and VACUUM/autovacuum |
Date: | 2014-02-03 18:57:45 |
Message-ID: | 033801cf2111$d3518750$79f495f0$@cfi.pl |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi,
I am not sure if it is bug or not but I found some strange behaviour. Maybe
it is the same as described on
http://www.postgresql.org/message-id/14616.1244317490@sss.pgh.pa.us ?). If
yes - I'm sorry for the trouble, but I think that my example is more
obvious.
Tested on PostgreSQL 9.2.4 and 9.2.6.
Console 1:
BEGIN;
DECLARE a CURSOR FOR SELECT * FROM tab;
--- Keep cursor open for disallow full vacuum of tab
Console 2:
SELECT count(*) FROM tab;
---- Result: 3588;
select reltuples from pg_class where relname='table';
--- Result: 3588
UPDATE tab SET id=id;
UPDATE tab SET id=id;
UPDATE tab SET id=id;
VACUUM ANALYZE tab;
select reltuples from pg_class where relname='table';
--- Result: 3588
Now wait few seconds :)
select reltuples from pg_class where relname='table';
--- Result: 12560
VACUUM ANALYZE tab;
select reltuples from pg_class where relname='table';
--- Result: 3588
There is 3588 live records and 12560 live+dead records in table.
That is strange for me. VACUUM updates pg_class.reltuples differently (only
live roiws count) than autovacuum (live and dead rows). Why?
Also in planning:
explain SELECT id FROM tab;
QUERY PLAN
----------------------------------------------------------------------
Seq Scan on tab (cost=0.00..1074.60 rows=12560 width=4)
Estimation is done with the use of current pg_class.reltuples value. This
value includes dead rows count after autovacuum so estimation is bad,
especially in more complex planner tree, for example:
Explain SELECT a.id FROM tab AS a JOIN tab AS b USING (id);
QUERY PLAN
----------------------------------------------------------------------------
-----------------------------
Nested Loop (cost=0.00..6410.70 rows=12560 width=4)
-> Seq Scan on tab a (cost=0.00..1074.60 rows=12560 width=8)
-> Index Only Scan using tab_pkey on tab b (cost=0.00..0.41 rows=1
width=4)
Index Cond: (id = a.id)
PostgreSQL estimates 12560 records in query result. This is wrong estimation
if dead tuples are removed during seq scan or index scan (I suppose that it
is).
I don't think that AUTOVACUUM and VACUUM ANALYZE should behave differently
:(
--------------------------------------------------------------------------
Artur Zajac
From | Date | Subject | |
---|---|---|---|
Next Message | Artur Zając | 2014-02-03 19:29:34 | Planner estimates and VACUUM/autovacuum |
Previous Message | Robert Haas | 2014-02-03 16:55:34 | Re: [PERFORM] encouraging index-only scans |