From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: PATCH: optimized DROP of multiple tables within a transaction |
Date: | 2012-12-20 11:33:00 |
Message-ID: | 20121220113300.GB4303@awork2.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2012-12-20 02:54:48 +0100, Tomas Vondra wrote:
> On 19.12.2012 02:18, Andres Freund wrote:
> > On 2012-12-17 00:31:00 +0100, Tomas Vondra wrote:
> >
> > I think except of the temp buffer issue mentioned below its ready.
> >
> >> -DropRelFileNodeAllBuffers(RelFileNodeBackend rnode)
> >> +DropRelFileNodeAllBuffers(RelFileNodeBackend * rnodes, int nnodes)
> >> {
> >> - int i;
> >> + int i, j;
> >> +
> >> + /* sort the list of rnodes */
> >> + pg_qsort(rnodes, nnodes, sizeof(RelFileNodeBackend), rnode_comparator);
> >>
> >> /* If it's a local relation, it's localbuf.c's problem. */
> >> - if (RelFileNodeBackendIsTemp(rnode))
> >> + for (i = 0; i < nnodes; i++)
> >> {
> >> - if (rnode.backend == MyBackendId)
> >> - DropRelFileNodeAllLocalBuffers(rnode.node);
> >> - return;
> >> + if (RelFileNodeBackendIsTemp(rnodes[i]))
> >> + {
> >> + if (rnodes[i].backend == MyBackendId)
> >> + DropRelFileNodeAllLocalBuffers(rnodes[i].node);
> >> + }
> >> }
> >
> > While you deal with local buffers here you don't anymore in the big loop
> > over shared buffers. That wasn't needed earlier since we just returned
> > after noticing we have local relation, but thats not the case anymore.
>
> Hmm, but that would require us to handle the temp relations explicitly
> wherever we call DropRelFileNodeAllBuffers. Currently there are two such
> places - smgrdounlink() and smgrdounlinkall().
>
> By placing it into DropRelFileNodeAllBuffers() this code is shared and I
> think it's a good thing.
>
> But that does not mean the code is perfect - it was based on the
> assumption that if there's a mix of temp and regular relations, the temp
> relations will be handled in the first part and the rest in the second one.
>
> Maybe it'd be better to improve it so that the temp relations are
> removed from the array after the first part (and skip the second one if
> there are no remaining relations).
The problem is that afaik without the backend-local part a temporary
relation could match the same relfilenode as a "full" relation which
would have pretty bad consequences.
> > Still surprised this is supposed to be faster than a memcmp, but as you
> > seem to have measured it earlier..
>
> It surprised me too. These are the numbers with the current patch:
>
> 1) one by one
> =============
> 0 2 4 6 8 10 12 14 16 18 20
> --------------------------------------------------------------
> current 15 22 28 34 41 75 77 82 92 99 106
> memcmp 16 23 29 36 44 122 125 128 153 154 158
>
> Until the number of indexes reaches ~10, the numbers are almost exactly
> the same. Then the bsearch branch kicks in and it's clear how much
> slower the memcmp comparator is.
>
> 2) batches of 100
> =================
> 0 2 4 6 8 10 12 14 16 18 20
> --------------------------------------------------------------
> current 3 5 8 10 12 15 17 21 23 27 31
> memcmp 4 7 10 13 16 19 22 28 30 32 36
>
> Here the difference is much smaller, but even here the memcmp is
> consistently a bit slower.
>
>
> My theory is that while the current comparator starts with the most
> variable part (relation OID), and thus usually starts after the
> comparing the first 4B, the memcmp starts from the other end (tablespace
> and database) and therefore needs to compare more data.
Thats a good guess. I think its ok this way, but if you feel like
playing you could just change the order in the struct...
> >> +void smgrdounlinkall(SMgrRelation * rels, int nrels, bool isRedo)
> >> +{
> >> + int i = 0;
> >> + RelFileNodeBackend *rnodes;
> >> + ForkNumber forknum;
> >> +
> >> + /* initialize an array which contains all relations to be dropped */
> >> + rnodes = palloc(sizeof(RelFileNodeBackend) * nrels);
> >> + for (i = 0; i < nrels; i++)
> >> + {
> >> + RelFileNodeBackend rnode = rels[i]->smgr_rnode;
> >> + int which = rels[i]->smgr_which;
> >> +
> >> + rnodes[i] = rnode;
> >> +
> >> + /* Close the forks at smgr level */
> >> + for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
> >> + (*(smgrsw[which].smgr_close)) (rels[i], forknum);
> >> + }
> >> +
> >> + /*
> >> + * Get rid of any remaining buffers for the relation. bufmgr will just
> >> + * drop them without bothering to write the contents.
> >> + */
> >> + DropRelFileNodeAllBuffers(rnodes, nrels);
> >
> > I think it might be a good idea to handle temp relations on/buffers on
> > this level instead of trying to put it into
> > DropRelFileNodeAllBuffers. Would also allow you to only build an array
> > of RelFileNode's instead of RelFileNodeBackend's which might make it
> > slightl faster.
>
> Hmmm, sadly that'd require duplication of code because it needs to be
> done in smgrdounlink too.
It should be fairly small though.
int nrels_nonlocal = 0;
for (i = 0; i < nrels; i++)
{
RelFileNodeBackend rnode = rels[i]->smgr_rnode;
int which = rels[i]->smgr_which;
if (RelFileNodeBackendIsTemp(rnode))
DropRelFileNodeAllLocalBuffers
else
rnodes[nrels_nonlocal++];
for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
(*(smgrsw[which].smgr_close)) (rels[i], forknum);
}
DropRelFileNodeAllLocalBuffers(rnode, nrels_nonlocal);
In smgrdounlink it should only be a
if (RelFileNodeBackendIsTemp(rnode))
DropRelFileNodeAllLocalBuffers(rnode)
else
DropRelFileNodeAllBuffers(rnode);
ISTM that would be cleaner then memmove'ing the rnode array to remove
the temp rels away.
> >
> >> + /*
> >> + * It'd be nice to tell the stats collector to forget it immediately, too.
> >> + * But we can't because we don't know the OID (and in cases involving
> >> + * relfilenode swaps, it's not always clear which table OID to forget,
> >> + * anyway).
> >> + */
> >
> > This and at least one other comment seems to be in too many locations
> > now.
>
> I see it in three places - smgrdounlink, smgrdounlinkall and
> smgrdounlinkfork. Which ones you consider superfluous? I think it's
> appropriate to keep them in all three places.
I only read the patch, not the result, so maybe youre right. I'll look
at it sometime.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2012-12-20 11:37:46 | Re: [GENERAL] trouble with pg_upgrade 9.0 -> 9.1 |
Previous Message | Groshev Andrey | 2012-12-20 11:19:17 | Re: [GENERAL] trouble with pg_upgrade 9.0 -> 9.1 |