From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Add database to PGXACT / per database vacuuming |
Date: | 2013-08-30 18:29:16 |
Message-ID: | 20130830182916.GA6794@awork2.anarazel.de |
Lists: | pgsql-hackers |
On 2013-08-30 21:07:04 +0300, Heikki Linnakangas wrote:
> On 30.08.2013 19:01, Andres Freund wrote:
> >For the logical decoding patch I added support for pegging
> >RecentGlobalXmin (and GetOldestXmin) to a lower value. To avoid causing
> >undue bloat & CPU overhead (hot pruning is friggin expensive) I split
> >RecentGlobalXmin into RecentGlobalXmin and RecentGlobalDataXmin where
> >the latter is the xmin horizon used for non-shared, non-catalog
> >tables. That removed almost all overhead I could measure.
> >
> >During that I was tinkering with the idea of reusing that split to
> >vacuum/prune user tables in a per db fashion. In a very quick and hacky
> >test that sped up the aggregate performance of concurrent pgbenches in
> >different databases by about 30%. So, somewhat worthwhile ;).
> >
> >The problem with that is that GetSnapshotData, which computes
> >RecentGlobalXmin, only looks at the PGXACT structures and not PGPROC
> >which contains the database oid. This is a recently added optimization
> >which made GetSnapshotData() quite a bit faster & more scalable, which is
> >important given the frequency it's called at.
>
> Hmm, so you're creating a version of GetSnapshotData() that only takes into
> account backends in the same database?
You can see what I did for logical decoding in http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blobdiff;f=src/backend/storage/ipc/procarray.c;h=11aa1f5a71196a61e31b711e0a044b2a5927a6cc;hp=9bf0989c9206b5e07053587f517d5e9a2322a628;hb=edcf0939072ebe68969560a7d54a26c123b279b4;hpb=ff4fa81665798642719c11c779d0518ef6611373
So, basically I compute the normal RecentGlobalXmin, and then just
subtract the "logical xmin" which is computed elsewhere to get the
catalog xmin.
What I'd done with the prototype of $topic (lost it, but I am going to
hack it together again) was just to compute RecentGlobalDataXmin (for
non-catalog, non-shared tables) at the same time as
RecentGlobalXmin (for everything else), simply by not lowering
RecentGlobalDataXmin if pgxact->dboid != MyDatabaseId.
So, the snapshot itself was the same, but because RecentGlobalDataXmin
is independent of the other databases, vacuum & pruning can clean up way
more, leading to a smaller database and higher performance.
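A minimal sketch of that single-pass computation follows. It is illustrative only and is not the prototype's code: SketchXact, SketchHorizons and sketch_compute_horizons are hypothetical names, the dboid field is the proposed addition rather than an existing PGXACT member, only xmin is considered, and plain "<" stands in for the wraparound-aware xid comparison real code would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint32_t Oid;

#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical, simplified stand-in for PGXACT with the proposed dboid. */
typedef struct SketchXact
{
    TransactionId xmin;   /* InvalidTransactionId if no snapshot is held */
    Oid           dboid;  /* database of the backend (assumed new field) */
} SketchXact;

typedef struct SketchHorizons
{
    TransactionId global_xmin; /* role of RecentGlobalXmin */
    TransactionId data_xmin;   /* role of RecentGlobalDataXmin */
} SketchHorizons;

/*
 * Compute both horizons in one pass over all entries.  "upper" is the
 * starting value for both horizons.  Every backend can lower global_xmin,
 * but only backends connected to my_db can lower data_xmin.
 * Assumption: xids do not wrap around, so plain "<" is sufficient here.
 */
static SketchHorizons
sketch_compute_horizons(const SketchXact *xacts, size_t n,
                        Oid my_db, TransactionId upper)
{
    SketchHorizons h = { upper, upper };

    assert(xacts != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        TransactionId xmin = xacts[i].xmin;

        if (xmin == InvalidTransactionId)
            continue;           /* backend holds no snapshot */

        if (xmin < h.global_xmin)
            h.global_xmin = xmin;

        /* Entries from other databases do not hold back the data horizon. */
        if (xacts[i].dboid == my_db && xmin < h.data_xmin)
            h.data_xmin = xmin;
    }
    return h;
}
```

With backends at xmin 100 and 80 in the current database and one at xmin 50 in another database, the sketch yields a global horizon of 50 but a data horizon of 80, which is the extra room vacuum and pruning would gain on user tables.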
> >Currently a single PGXACT is 12 bytes which means we a) have several
> >entries in a single cacheline b) have ugly sharing because we will have
> >PGXACTs split over more than one cacheline.
>
> I can't get excited about either of these arguments, though. The reason for
> having separate PGXACT structs is that they are as small as possible, so
> that you can fit as many of them as possible in as few cache lines as
> possible. Whether one PGXACT crosses a cache line or not is not important,
> because when taking a snapshot, you scan through all of them.
The problem with that is that we actually write to PGXACT pretty
frequently (at least ->xid, ->xmin, ->nxids, ->delayChkpt). As soon as
you factor that in, sharing cachelines between backends can hurt. Even
a plain GetSnapshotData() will write to MyPgXact->xmin...
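The layout arithmetic behind that can be sketched as below. The 12-byte entry size is from this thread; the 64-byte cache line, the array starting on a line boundary, and the 16-byte size (12 bytes plus a 4-byte oid, with no extra padding) are assumptions for illustration. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * True if entry "index" in a dense array of entry_size-byte entries
 * occupies bytes in two different cache lines.
 * Assumption: the array starts exactly on a cache line boundary.
 */
static bool
sketch_entry_straddles(size_t index, size_t entry_size, size_t line_size)
{
    assert(entry_size > 0 && line_size > 0);

    size_t first_byte = index * entry_size;
    size_t last_byte = first_byte + entry_size - 1;

    return first_byte / line_size != last_byte / line_size;
}

/* Count how many of the first n entries straddle a cache line boundary. */
static size_t
sketch_count_straddlers(size_t n, size_t entry_size, size_t line_size)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (sketch_entry_straddles(i, entry_size, line_size))
            count++;
    return count;
}
```

Under those assumptions, entry 5 of a 12-byte array covers bytes 60..71 and so spans two lines, while 16-byte entries never do. Either way several backends' entries still share one line, so a write by one backend can invalidate the line for its neighbours.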
> I don't know how big an impact adding the database oid would have, on the
> case that the PGPROC/PGXACT split was done in the first place. In the worst
> case it will make taking a snapshot 1/3 slower under contention. That needs
> to be tested.
Yes, definitely. I am basically wondering whether somebody has/sees
fundamental problems with it that make it pointless to investigate.
> One idea is to have a separate PGXACT array for each database? Well, that
> might be difficult, but something similar, like group all PGXACTs for one
> database together, and keep a separate lookup array for where the entries
> for each database begins.
Given that we will have to search all PGXACT entries anyway because of
shared relations for the foreseeable future, I can't see that being
really beneficial.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services