From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: autovacuum stress-testing our system
Date: 2012-11-21 18:02:56
Message-ID: CA+TgmoZr3xUQq_OM2V1nmEN6NScVBFkpYJrpFFmbyOPJKvZdzQ@mail.gmail.com
Lists: pgsql-hackers
On Sun, Nov 18, 2012 at 5:49 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> The two main changes are these:
>
> (1) The stats file is split into a common "db" file, containing all the
> DB Entries, and per-database files with tables/functions. The common
> file is still called "pgstat.stat", the per-db files have the
> database OID appended, so for example "pgstat.stat.12345" etc.
>
> This was a trivial hack to the pgstat_read_statsfile/pgstat_write_statsfile
> functions, introducing two new functions:
>
> pgstat_read_db_statsfile
> pgstat_write_db_statsfile
>
> that do the trick of reading/writing stat file for one database.
>
> (2) The pgstat_read_statsfile function has an additional parameter
> "onlydbs" that says that you don't need table/func stats - just the
> list of db entries. This is used for the autovacuum launcher, which
> does not need to read the table/func stats (if I'm reading the code
> in autovacuum.c correctly - it seems to be working as expected).
I'm not an expert on the stats system, but this seems like a promising
approach to me.
> (a) It does not solve the "many-schema" scenario at all - that'll need
> a completely new approach I guess :-(
We don't need to solve every problem in the first patch. I've got no
problem kicking this one down the road.
> (b) It does not solve the writing part at all - the current code uses a
> single timestamp (last_statwrite) to decide if a new file needs to
> be written.
>
> That clearly is not enough for multiple files - there should be one
> timestamp for each database/file. I'm thinking about how to solve
> this and how to integrate it with pgstat_send_inquiry etc.
Presumably you need a last_statwrite for each file, in a hash table or
something, and requests need to specify which file is needed.
> And yet another one I'm thinking about is using a fixed-length
> array of timestamps (e.g. 256), indexed by mod(dboid,256). That
> would mean stats for all databases with the same mod(oid,256) would
> be written at the same time. Seems like an over-engineering though.
That seems like an unnecessary kludge.
> (c) I'm a bit worried about the number of files - right now there's one
> for each database and I'm thinking about splitting them by type
> (one for tables, one for functions) which might make it even faster
> for some apps with a lot of stored procedures etc.
>
> But is the large number of files actually a problem? After all,
> we're using one file per relation fork in the "base" directory, so
> this seems like a minor issue.
I don't see why one file per database would be a problem. After all,
we already have one directory per database inside base/. If the user
has so many databases that dirent lookups in a directory of that size
are a problem, they're already hosed, and this will probably still
work out to a net win.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company