From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Farina <daniel(at)heroku(dot)com>, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>, Harold A(dot) Giménez <harold(dot)gimenez(at)gmail(dot)com>
Subject: Re: [PERFORM] DELETE vs TRUNCATE explanation
Date: 2012-07-19 14:09:26
Message-ID: 20408.1342706966@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-performance

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Seems a bit complex, but it might be worth it. Keep in mind that I
> eventually want to be able to make an unlogged table logged or vice
> versa, which will probably entail unlinking just the init fork (for
> the logged -> unlogged direction).
Well, as far as that goes, I don't see a reason why you couldn't unlink
the init fork immediately on commit. The checkpointer should not have
to be involved at all --- there's no reason to send it a FORGET FSYNC
request either, because there shouldn't be any outstanding writes
against an init fork, no?
But having said that, this does serve as an example that we might
someday want the flexibility to kill individual forks. I was
intending to kill smgrdounlinkfork altogether, but I'll refrain.
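To make that concrete, the commit-time cleanup could be as little as
this (a sketch only: it assumes smgrdounlinkfork keeps roughly its
present signature, and that the RelFileNode in hand is called rnode):

    /*
     * Sketch: at commit of an unlogged -> logged conversion, drop
     * just the init fork.  Nothing writes to an init fork after it
     * is created, so there is no dirty buffer to flush and no
     * pending fsync request to cancel; hence no FORGET FSYNC
     * message to the checkpointer.
     */
    SMgrRelation srel = smgropen(rnode, InvalidBackendId);

    smgrdounlinkfork(srel, INIT_FORKNUM, false);    /* isRedo = false */
    smgrclose(srel);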
> I think this is just over-engineered. The originally complained-of
> problem was all about the inefficiency of manipulating the
> checkpointer's backend-private data structures, right? I don't see
> any particular need to mess with the shared memory data structures at
> all. If you wanted to add some de-duping logic to retail fsync
> requests, you could probably accomplish that more cheaply by having
> each such request look at the last half-dozen or so items in the queue
> and skip inserting the new request if any of them match the new
> request. But I think that'd probably be a net loss, because it would
> mean holding the lock for longer.
What about checking just the immediately previous entry? This would
at least fix the problem for bulk-load situations, and the cost ought
to be about negligible compared to acquiring the LWLock.
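In pseudo-patch form, the check could live in ForwardFsyncRequest(),
right after we take the lock (field names approximate, not a tested
patch):

    /*
     * Sketch: skip queuing an fsync request that exactly duplicates
     * the most recent entry.  In a bulk load the same
     * relation/fork/segment is requested over and over, so this one
     * comparison catches most of the duplicates at essentially no
     * cost beyond the LWLock we already hold.
     */
    if (CheckpointerShmem->num_requests > 0)
    {
        CheckpointerRequest *prev =
            &CheckpointerShmem->requests[CheckpointerShmem->num_requests - 1];

        if (RelFileNodeEquals(prev->rnode, rnode) &&
            prev->forknum == forknum &&
            prev->segno == segno)
        {
            LWLockRelease(CheckpointerCommLock);
            return true;        /* request already queued */
        }
    }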
I have also been wondering about de-duping on the backend side, but
the problem is that if a backend remembers its last few requests,
it doesn't know when that cache has to be cleared because of a new
checkpoint cycle starting. We could advertise the current cycle
number in shared memory, but you'd still need to take a lock to
read it. (If we had memory fence primitives it could be a bit
cheaper, but I dunno how much.)
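Just to show where the lock hurts, the backend-side idea would look
something like this (the ckpt_cycle field is hypothetical; nothing
advertises such a counter today):

    /*
     * Hypothetical backend-local cache of recent fsync requests.
     * It must be invalidated whenever the checkpointer starts a new
     * cycle, which is exactly the information we have no cheap way
     * to read at present.
     */
    static uint32 local_cycle = 0;
    static CheckpointerRequest recent_reqs[8];
    static int  num_recent = 0;

    uint32      cur_cycle;

    LWLockAcquire(CheckpointerCommLock, LW_SHARED);
    cur_cycle = CheckpointerShmem->ckpt_cycle;      /* hypothetical */
    LWLockRelease(CheckpointerCommLock);

    if (cur_cycle != local_cycle)
    {
        num_recent = 0;         /* new cycle: flush the cache */
        local_cycle = cur_cycle;
    }
    /* ... then consult recent_reqs[] before sending a request ... */

and that LWLockAcquire eats whatever we saved by not queuing the
duplicate, which is why the fence-based variant might be worth
thinking about.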
regards, tom lane