From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: drop/truncate table sucks for large values of shared buffers |
Date: | 2015-07-02 00:58:20 |
Message-ID: | CAA4eK1LxXh_SHZzKSW9pv8MnhLFT4dHP-Oh-tuKgzwWc8RrBZw@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Jul 1, 2015 at 8:26 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On 1 July 2015 at 15:39, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
>>
>> Okay. I think we can maintain the list in a similar way to what we do for
>> UNLINK_RELATION_REQUEST in RememberFsyncRequest(), but
>> why wait till 64 tables?
>
>
> I meant once per checkpoint cycle OR every N tables, whichever is sooner.
> N would need to vary according to the size of NBuffers.
>
That sounds sensible to me; see my reply below for what I think we can do
to make this work.
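
To make the list part concrete first, here is a rough standalone sketch. The
names (PendingRelDrop, remember_dropped_rel) are made up for illustration; it
only models the spirit of queuing requests the way RememberFsyncRequest()
does, and is not the actual md.c code:

```c
#include <stdio.h>
#include <stdlib.h>

typedef unsigned int Oid;

/* One entry per relation dropped/truncated since the last checkpoint. */
typedef struct PendingRelDrop
{
	Oid		relfilenode;		/* file identifier of the dropped relation */
	struct PendingRelDrop *next;
} PendingRelDrop;

static PendingRelDrop *pending_drops = NULL;
static int	pending_drop_count = 0;

/*
 * Called at DROP/TRUNCATE time: instead of scanning all of shared_buffers
 * immediately, just remember the relfilenode so the next checkpoint-time
 * buffer scan can handle all pending drops in one pass.
 */
static void
remember_dropped_rel(Oid relfilenode)
{
	PendingRelDrop *entry = malloc(sizeof(PendingRelDrop));

	entry->relfilenode = relfilenode;
	entry->next = pending_drops;
	pending_drops = entry;
	pending_drop_count++;
}

int
main(void)
{
	remember_dropped_rel(16384);
	remember_dropped_rel(16390);
	printf("relations waiting for checkpoint scan: %d\n", pending_drop_count);
	return 0;
}
```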
>>
>> We already scan the whole buffer list in each
>> checkpoint cycle, so during that scan we can refer to this dropped-relation
>> list and avoid syncing such buffer contents. Also, for ENOENT error
>> handling in FileWrite, we can use this list to identify the relations for
>> which we need to ignore the error. I think we are already doing something
>> similar in mdsync to avoid the problem of dropped tables, so it seems okay
>> to have it in mdwrite as well.
>>
>> The crucial thing to think about in this idea is avoiding reassignment of
>> a relfilenode (due to wrapped OIDs) before we have ensured that none of
>> the buffers contains a tag for that relfilenode. Currently we avoid this
>> for the fsync case by retaining the first segment of the relation (which
>> prevents reassignment of the relfilenode) till the checkpoint ends. I think
>> if we just postpone the unlink till we have validated it against
>> shared_buffers, then we can avoid this problem in the new scheme, and it
>> should delay unlinking such a file by at most one checkpoint cycle,
>> assuming we consult the dropped-relation list during the buffer scan in
>> each checkpoint cycle.
>
>
> Yes
>
> So you are keeping more data around for longer, right?
Yes, and we already do that for the sake of fsyncs.
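
As a rough illustration of the checkpoint-time part, here is a self-contained
toy model (BufDesc, is_pending_drop() and the hard-coded data are invented for
this sketch; it is not PostgreSQL's BufferSync()/mdwrite() code). Dirty
buffers whose tag matches a pending-dropped relfilenode are simply invalidated
instead of written, and only once the scan has finished would the retained
first segment be unlinked, allowing the relfilenode to be reused:

```c
#include <stdbool.h>
#include <stdio.h>

typedef unsigned int Oid;

typedef struct BufDesc
{
	Oid		relfilenode;	/* which relation's file this buffer caches */
	int		blocknum;
	bool	dirty;
} BufDesc;

/* Relations dropped since the last checkpoint (see the list sketch above). */
static const Oid pending_drops[] = {16390};

static bool
is_pending_drop(Oid relfilenode)
{
	for (size_t i = 0; i < sizeof(pending_drops) / sizeof(pending_drops[0]); i++)
	{
		if (pending_drops[i] == relfilenode)
			return true;
	}
	return false;
}

int
main(void)
{
	BufDesc		buffers[] = {
		{16384, 0, true},
		{16390, 3, true},		/* belongs to a dropped relation */
		{16384, 7, false},
	};
	int			nbuffers = sizeof(buffers) / sizeof(buffers[0]);

	for (int i = 0; i < nbuffers; i++)
	{
		if (!buffers[i].dirty)
			continue;

		if (is_pending_drop(buffers[i].relfilenode))
		{
			/*
			 * Don't write it out: the file is about to go away, and a write
			 * could otherwise fail with ENOENT.  Just drop the dirty flag.
			 */
			printf("skipping block %d of dropped rel %u\n",
				   buffers[i].blocknum, buffers[i].relfilenode);
			buffers[i].dirty = false;
			continue;
		}
		printf("writing block %d of rel %u\n",
			   buffers[i].blocknum, buffers[i].relfilenode);
	}

	/*
	 * After the scan, no buffer can still carry a tag for a dropped
	 * relfilenode, so the retained first segment can now be unlinked and the
	 * relfilenode safely reused.
	 */
	return 0;
}
```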
> I think we would need some way to trigger a scan when the amount of
> deferred dropped data files hits a certain size.
>
Okay, I think we can trigger the checkpoint once the number of dropped
relations reaches 64 or 0.01% (or 0.1%) of shared_buffers, whichever is
smaller.
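
Purely to illustrate those numbers, a tiny sketch of such a trigger check
(drop_list_needs_checkpoint() and the nbuffers parameter are stand-ins for
this example, not an actual GUC or function):

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Return true once the pending-drop list is "big enough": 64 entries or
 * 0.01% of the number of shared buffers, whichever is smaller.
 */
static bool
drop_list_needs_checkpoint(int pending_drop_count, int nbuffers)
{
	int			threshold = nbuffers / 10000;	/* 0.01% of shared buffers */

	if (threshold > 64)
		threshold = 64;			/* whichever is smaller */
	if (threshold < 1)
		threshold = 1;
	return pending_drop_count >= threshold;
}

int
main(void)
{
	/* shared_buffers = 8GB is 1048576 buffers; 0.01% is 104, so the 64 cap wins */
	printf("%d\n", drop_list_needs_checkpoint(70, 1048576));	/* prints 1 */
	printf("%d\n", drop_list_needs_checkpoint(50, 1048576));	/* prints 0 */
	return 0;
}
```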
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com