From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: logical decoding : exceeded maxAllocatedDescs for .spill files |
Date: | 2019-11-19 08:37:46 |
Message-ID: | CAA4eK1+cey6i6a0zD9kk_eaDXb4RPNZqu4UwXO9LbHAgMpMBkg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Nov 18, 2019 at 5:50 PM Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
>
> On Mon, 18 Nov 2019 at 17:20, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > I see that you have made changes in ReorderBufferRestoreChanges to use
> > PathNameOpenFile, but not in ReorderBufferSerializeTXN. Is there a
> > reason for the same? In my test environment, with the test provided
> > by you, I got the error (reported in this thread) via
> > ReorderBufferSerializeTXN.
>
> You didn't get this error with the patch applied, did you ?
>
No, I got this before applying the patch. However, after applying the
patch, I got below error in the same test:
postgres=# SELECT 1 from
pg_logical_slot_get_changes('regression_slot', NULL,NULL) LIMIT 1;
ERROR: could not read from reorderbuffer spill file: Invalid argument
It seems to me that FileRead API used in the patch can return value <
0 on EOF. See the API usage in BufFileLoadBuffer. I got this error
on a windows machine and in the server log the message was "LOG:
unrecognized win32 error code: 38" which indicates "Reached the end of
the file."
> If you were debugging this without the patch applied, I suspect that
> the reason why ReorderBufferSerializeTXN() => OpenTransientFile() is
> generating this error is because the max limit must be already crossed
> because of earlier calls to ReorderBufferRestoreChanges().
>
> Note that in ReorderBufferSerializeTXN(), OpenTransientFile() is
> sufficient because the code in that function has made sure the fd gets
> closed there itself.
>
Okay, then we might not need it there, but we should at least add a
comment in ReorderBufferRestoreChanges to explain why we have used a
different function to operate on the file at that place.
>
> For the API's that use VFDs (like PathNameOpenFile), the files opened
> are always recorded in the VfdCache array. So it is not required to do
> the cleanup at (sub)transaction end, because the kernel fds get closed
> dynamically in ReleaseLruFiles() whenever they reach max_safe_fds
> limit. So if a transaction aborts, the fds might remain open, but
> those will get cleaned up whenever we require more fds, through
> ReleaseLruFiles(). Whereas, for files opened through
> OpenTransientFile(), VfdCache is not involved, so this needs
> transaction end cleanup.
>
Have you tried by injecting some error? After getting the error
mentioned above in email, when I retried the same query, I got the
below message.
postgres=# SELECT 1 from
pg_logical_slot_get_changes('regression_slot', NULL,NULL) LIMIT 1;
ERROR: could not remove file
"pg_replslot/regression_slot/xid-1693-lsn-0-18000000.spill" during
removal of pg_replslot/regression_slot/xid*: Permission denied
And, then I tried to drop the replication slot and I got below error.
postgres=# SELECT * FROM pg_drop_replication_slot('regression_slot');
ERROR: could not rename file "pg_replslot/regression_slot" to
"pg_replslot/regression_slot.tmp": Permission denied
It might be something related to Windows, but you can once try by
injecting some error after reading a few files in the code path and
see the behavior.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2019-11-19 09:16:47 | Re: range_agg |
Previous Message | Thomas Munro | 2019-11-19 07:46:21 | Re: Planner chose a much slower plan in hashjoin, using a large table as the inner table. |