Re: huge pgstat.stat file on PostgreSQL 8.3.24

From: "Huang, Suya" <Suya(dot)Huang(at)au(dot)experian(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: huge pgstat.stat file on PostgreSQL 8.3.24
Date: 2014-06-24 07:07:37
Message-ID: D83E55F5F4D99B4A9B4C4E259E6227CD014AF1AD@AUX1EXC01.apac.experian.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

-----Original Message-----
From: Tomas Vondra [mailto:tv(at)fuzzy(dot)cz]
Sent: Friday, June 20, 2014 8:14 PM
To: Pavel Stehule
Cc: Huang, Suya; pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] huge pgstat.stat file on PostgreSQL 8.3.24

On 20 Červen 2014, 5:33, Pavel Stehule wrote:
> 2014-06-20 1:44 GMT+02:00 Huang, Suya <Suya(dot)Huang(at)au(dot)experian(dot)com>:
>>
>> Thanks Pavel, to be more clear, what does " pg_stat_reset "really reset?
>> In the document it says " Reset all statistics counters for the
>> current database to zero(requires superuser privileges) ". I thought
>> it would reset all statistics of all tables/indexes, thus why I am
>> thinking of re-run analyze database to gather statistics. Because if
>> table/indexes don't have statistics, the query plan would be affected
>> which is not a good thing to a production box... I'm not so sure if I
>> understand "run statistics" you mentioned here.
>>
>
> you have true - anyway you can clean a content of this directory - but
> if your database has lot of database objects, your stat file will have
> a original size very early
>
> Pavel
>

No, he's not right.

Suya, as I wrote in my previous message, there are two kinds of statistics in PostgreSQL

a) data distribution statistics
- histograms, MCV lists, number of distinct values, ...
- stored in regular tables
- used for planning
- collected by ANALYZE
- not influenced by pg_stat_reset() at all

b) runtime statistics
- number of scans for table/index, rows fetched from table/index, ...
- tracks activity within the database
- stored in pgstat.stat file (or per-db files in the recent releases)
- used for monitoring, not for planning
- removed by pg_stat_reset()

So running pg_stat_reset will not hurt planning at all.

regards
Tomas

Hi Tomas,

You're right, my DB version is 8.3.11, I remembered the wrong version... we've got a new project using the latest version 9.3.4, and the old DB will be decommissioned in the future, so that's why the management people don't want to spend resources on upgrading and QA, etc.

Still have a question of why the file would become so big, is that related to the number of objects I have in database?

Thanks again for your clear explanation on the two different statistics in PostgreSQL DB, really helped a lot! I'm wondering if they should also exist in the documentation, as it really confuses people... :)

Thanks,
Suya

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Brio 2014-06-25 00:13:25 Re: postgres files in use not staying in linux file cache
Previous Message tim_wilson 2014-06-24 00:39:47 Re: autovacuum vacuum creates bad statistics for planner when it log index scans: 0