From: | Gordon Shannon <gordo169(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Surprising dead_tuple_count from pgstattuple |
Date: | 2010-08-06 18:31:10 |
Message-ID: | 1281119470961-2266955.post@n5.nabble.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
This is an expansion of the question I posed in this thread:
I am framing the question here in relation to pgstattuple. Running 8.4.4 on
Centos.
I have a table T with 5,063,463 rows. It was just restored from a backup,
and there is no other activity in this database. I ran a vacuum.
pg_stat_user_tables.n_dead_tup (which is really
pg_stat_get_dead_tuples('T'::regclass::oid)) says 0
pgstattuple says dead_tuple_count=0, free_space=1,355,152
1. I delete 10,000 rows.
pg_stat_user_tables.n_live_tup -> 5053463
pg_stat_user_tables.n_dead_tup -> 10000
pgstattuple.dead_tuple_count -> 10000
pgstattuple.free_space -> 1355152
So far, so good. pgstattuple is counting the dead tuples, and not including
those tuples in the free space count.
2. I delete 15,000 more rows.
pg_stat_user_tables.n_live_tup -> 5038463
pg_stat_user_tables.n_dead_tup -> 25000
pgstattuple.dead_tuple_count -> 15000 ??
pgstattuple.free_space -> 1996904 ??
pgstattuple now appears to count the earlier 10K deleted tuples as no longer
dead, but free space.
3. I delete 50,000 more rows.
pg_stat_user_tables.n_live_tup -> 4988463
pg_stat_user_tables.n_dead_tup -> 75000
pgstattuple.dead_tuple_count -> 50022 ??
pgstattuple.free_space -> 2966628 ??
Same thing, pgstattuple appears to "see" only the most recent delete
transaction (but off by 22), and count the prior ones as free.
4. vacuum verbose
vacuum verbose t;
INFO: vacuuming "public.t"
INFO: scanned index "t_pkey" to remove 75000 row versions
DETAIL: CPU 0.01s/0.38u sec elapsed 0.40 sec.
INFO: "t": removed 75000 row versions in 637 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "t_pkey" now contains 4988463 row versions in 13886 pages
DETAIL: 75000 index row versions were removed.
204 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "t": found 50022 removable, 3696 nonremovable row versions in 668 out
of 51958 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.01s/0.39u sec elapsed 0.40 sec.
VACUUM
Time: 482.771 ms
It seems relevant that vacuum reports the same incorrect number -- 50022 --
as part of its output. That makes me think that pgstattuple may be using
similar logic to get its dead tuple count.
I wonder if the key to this is that pgstattuple uses
HeapTupleSatisfiesVisibility() to test for deadness. If so, why would this
call return apparently false positives?
I know that pgstattuple is meant to be used for debugging only. I have found
pgstatindex to be very helpful in identifying bloat in my indexes.
Per Tom in the other thread, I now understand that the "found 50022
removable, 3696 nonremovable...." line is referring to the subset of pages
that it scanned looking for dead tuples.
I keep coming back to this, though -- 50,022 seems to be just wrong, or
perhaps simply misleading -- i.e. way too low.
It's present in the output of vacuum, and the output of pgstattuple.
I'd like to understand what meaning this number has, and, ideally, how I can
use to to detect things like bloat or fragmentation.
Thanks!
--
View this message in context: http://postgresql.1045698.n5.nabble.com/Surprising-dead-tuple-count-from-pgstattuple-tp2266955p2266955.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2010-08-06 18:32:33 | Re: Initial review of xslt with no limits patch |
Previous Message | Robert Haas | 2010-08-06 18:30:24 | Re: including backend ID in relpath of temp rels - updated patch |