From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | including backend ID in relpath of temp rels - updated patch |
Date: | 2010-06-10 20:39:50 |
Message-ID: | AANLkTilvFjGC_64S42dHOeRlqV2eAURDjE2u67XBEhyj@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
For previous discussion of this topic, see:
http://archives.postgresql.org/pgsql-hackers/2010-04/msg01181.php
http://archives.postgresql.org/pgsql-hackers/2010-05/msg00352.php
http://archives.postgresql.org/pgsql-hackers/2010-06/msg00302.php
As in the original version of the patch, I have not simply added
backend ID to RelFileNode, because there are too many places using
RelFileNode in contexts where the backend ID can be determined from
context, such as the shared and local buffer managers and the xlog
code. Instead, I have introduced BackendRelFileNode for contexts
where we need both the RelFileNode and the backend ID. The smgr layer
has to use BackendRelFileNode across the board, since it operates on
both permanent and temporary relation, including - potentially -
temporary relations of other backends. smgr invalidations must also
include the backend ID, as must communication between regular backends
and the bgwriter. The relcache now stores rd_backend instead of
rd_islocaltemp so that it remains straightforward to call smgropen()
based on a relcache entry. Some smgr functions no longer require an
isTemp argument, because they can infer the necessary information from
their BackendRelFileNode. smgrwrite() and smgrextend() now take a
skipFsync argument rather than an isTemp argument.
In this version of the patch, I've improved the temporary file cleanup
mechanism. In CVS HEAD, when a transaction commits or aborts, we
write an XLOG record with all relations that must be unlinked,
including temporary relations. With this patch, we no longer include
temporary relations in the XLOG records (which may be a tiny
performance benefit for some people); instead, on every startup of the
database cluster, we just nuke all temporary relation files (which is
now easy to do, since they are named differently than files for
non-temporary relations) at the same time that we nuke other temp
files. This produces slightly different behavior. In CVS HEAD,
temporary files get removed whenever an xlog redo happens, so either
at cluster start or after a backend crash, but only to the extent that
they appear in WAL. With this patch, we can be sure of removing ALL
stray files, which is maybe ever-so-slightly leaky in CVS HEAD. But
on the other hand, because it hooks into the existing temporary file
cleanup code, it only happens at cluster startup, NOT after a backend
crash. The existing coding leaves temporary sort files and similar
around after a backend crash for forensics purposes. That might or
might not be worth rethinking for non-debug builds, but I don't think
there's any very good reason to think that temporary relation files
need to be handled differently than temporary work files.
I believe that this patch will clear away one major obstacle to
implementing global temporary and unlogged tables: it enables us to be
sure of cleaning up properly after a crash without relying on catalog
entries or XLOG. Based on previous discussions, however, I believe
there is support for making this change independently of what becomes
of that project, just for the benefit of having a more robust cleanup
mechanism.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Attachment | Content-Type | Size |
---|---|---|
temprelnames-v2.patch | application/octet-stream | 79.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Smith | 2010-06-10 21:34:29 | Re: fix use of posix_fadvise in xlog.c |
Previous Message | Greg Smith | 2010-06-10 20:15:44 | Re: Large (almost 50%!) performance drop after upgrading to 8.4.4? |