From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: hash index on unlogged tables doesn't behave as expected |
Date: | 2017-07-04 07:53:51 |
Message-ID: | CAB7nPqR-2GMDeEEVa7Sh5xNsOZ_SjnsWkxYLfwqBVhqnLO6v=A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jul 3, 2017 at 7:40 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> There is already such a test, see create_index.sql.
> CREATE UNLOGGED TABLE unlogged_hash_table (id int4);
> CREATE INDEX unlogged_hash_index ON unlogged_hash_table USING hash (id
> int4_ops);
>
> Do you have something else in mind?
>
> I think the problem mentioned in this thread can be caught only if we
> promote the standby and restart it. We might be able to write it
> using TAP framework, but I think such a test should exist for other
> index types as well or in general for unlogged tables. I am not
> opposed to writing such a test if you and or others feel so.
There have been many requests for something like that. This overlaps a
lot with the existing make check though, so a buildfarm module using
wal_consistency_checking may be a better idea than something in core.
>> Apart from that i have tested the patch and the patch seems to be working
>> fine.
>
> Thanks.
It seems to me that this qualifies as an open item for 10. WAL-logging
is new for hash tables. Amit, could you add one?
+ use_wal = RelationNeedsWAL(rel) ||
+ (rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
+ forkNum == INIT_FORKNUM);
In access AMs, empty() routines are always only called for unlogged
relations, so you could relax that to check for INIT_FORKNUM only.
Looking at the patch, I think that you are right to do the logging
directly in _hash_init() and not separately in hashbuildempty().
Compared to other relations the init data is more dynamic.
+ /*
+ * Force the on-disk state of init forks to always be in sync with the
+ * state in shared buffers. See XLogReadBufferForRedoExtended.
+ */
+ XLogRecGetBlockTag(record, 0, NULL, &forknum, NULL);
+ if (forknum == INIT_FORKNUM)
+ FlushOneBuffer(metabuf);
I find those forced syncs a bit surprising. The buffer is already
marked as dirty, so even if there is a concurrent checkpoint they
would be flushed. If recovery comes again here, then they would just
be replayed. I am unclear why XLogReadBufferForRedoExtended is not
enough to replay a FPW of that.
--
Michael
From | Date | Subject | |
---|---|---|---|
Next Message | Etsuro Fujita | 2017-07-04 08:25:17 | Re: UPDATE of partition key |
Previous Message | Dean Rasheed | 2017-07-04 07:49:05 | Re: Multi column range partition table |