From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: COPY with hints, rebirth |
Date: | 2012-02-26 19:16:46 |
Message-ID: | 4F4A851E.3080501@enterprisedb.com |
Lists: | pgsql-hackers |
On 24.02.2012 22:55, Simon Riggs wrote:
> A long time ago, in a galaxy far away, we discussed ways to speed up
> data loads/COPY.
> http://archives.postgresql.org/pgsql-hackers/2007-01/msg00470.php
>
> In particular, the idea that we could mark tuples as committed while
> we are still loading them, to avoid negative behaviour for the first
> reader.
>
> Simple patch to implement this is attached, together with test case.
>
> ...
>
> What exactly does it do? Previously, we optimised COPY when it was
> loading data into a newly created table or a freshly truncated table.
> This patch extends that and actually sets the tuple header flag as
> HEAP_XMIN_COMMITTED during the load. Doing so is simple 2 lines of
> code. The patch also adds some tests for corner cases that would make
> that action break MVCC - though those cases are minor and typical data
> loads will benefit fully from this.
This doesn't work with subtransactions:
postgres=# create table a as select 1 as id;
SELECT 1
postgres=# copy a to '/tmp/a';
COPY 1
postgres=# begin;
BEGIN
postgres=# truncate a;
TRUNCATE TABLE
postgres=# savepoint sp1;
SAVEPOINT
postgres=# copy a from '/tmp/a';
COPY 1
postgres=# select * from a;
id
----
(0 rows)
The SELECT should return the row that was copied in within the same
subtransaction, but it returns no rows.
> In the link above, Tom suggested reworking HeapTupleSatisfiesMVCC()
> and adding current xid to snapshots. That is an invasive change that I
> would wish to avoid at any time and explains the long delay in
> tackling this. The way I've implemented it, is just as a short test
> during XidInMVCCSnapshot() so that we trap the case when the xid ==
> xmax and so would appear to be running. This is much less invasive and
> just as performant as Tom's original suggestion.
TransactionIdIsCurrentTransactionId() can be fairly expensive if you
have a lot of subtransactions open...
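
To illustrate the cost concern, here is a minimal sketch of the kind of lookup involved. It is a simplified model and not the actual PostgreSQL code: the XactLevel struct and the xid_in_sorted() and xid_is_current() functions are hypothetical names, xid wraparound is ignored, and aborted levels are not modelled. It assumes each open (sub)transaction level holds its own xid plus a sorted array of xids from subtransactions that have already committed into it, and that the check walks from the innermost level out to the top.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical, simplified stand-in for one level of the transaction stack. */
typedef struct XactLevel {
    TransactionId xid;                /* xid of this (sub)transaction */
    const TransactionId *child_xids;  /* sorted xids of committed children */
    size_t n_child_xids;              /* number of entries in child_xids */
    const struct XactLevel *parent;   /* enclosing level, NULL at top level */
} XactLevel;

/* Binary search in a sorted array of xids (wraparound is ignored here). */
static bool xid_in_sorted(const TransactionId *xids, size_t n, TransactionId xid)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (xids[mid] == xid)
            return true;
        if (xids[mid] < xid)
            lo = mid + 1;
        else
            hi = mid;
    }
    return false;
}

/*
 * Returns true if xid belongs to any level of the stack starting at
 * 'current'. *levels_visited reports how many levels were examined,
 * which is the part that grows with the number of open subtransactions.
 */
bool xid_is_current(const XactLevel *current, TransactionId xid,
                    size_t *levels_visited)
{
    assert(levels_visited != NULL);
    *levels_visited = 0;

    for (const XactLevel *s = current; s != NULL; s = s->parent) {
        (*levels_visited)++;
        if (s->xid == xid)
            return true;
        if (xid_in_sorted(s->child_xids, s->n_child_xids, xid))
            return true;
    }
    return false;
}
```

In this model a miss has to visit every open level and do one binary search per level, so a check that runs once per tuple gets slower as savepoints nest deeper.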
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com