From: | Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> |
---|---|
To: | Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com> |
Cc: | Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Patch: Write Amplification Reduction Method (WARM) |
Date: | 2016-12-27 13:21:48 |
Message-ID: | CABOikdOvTLp5qYtoN+svfdHAcSR2x=yt_JcoohQhdv6pmnkWrA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Dec 26, 2016 at 11:49 AM, Jaime Casanova <
jaime(dot)casanova(at)2ndquadrant(dot)com> wrote:
> On 2 December 2016 at 07:36, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
> wrote:
> >
> > I've updated the patches after fixing the issue. Multiple rounds of
> > regression passes for me without any issue. Please let me know if it
> works
> > for you.
> >
>
> Hi Pavan,
>
> Today i was playing with your patch and running some tests and found
> some problems i wanted to report before i forget them ;)
>
>
Thanks Jaime for the tests and bug reports. I'm attaching an add-on patch
which fixes these issues for me. I'm deliberately not sending a fresh
revision because the changes are still minor.
> * You need to add a prototype in src/backend/utils/adt/pgstatfuncs.c:
> extern Datum pg_stat_get_tuples_warm_updated(PG_FUNCTION_ARGS);
>
Added.
>
> * The isolation test for partial_index fails (attached the
> regression.diffs)
>
Fixed. Looks like I forgot to include attributes from predicates and
expressions in the list of index attributes (as pointed by Alvaro)
>
> * running a home-made test i have at hand i got this assertion:
> """
> TRAP: FailedAssertion("!(buf_state & (1U << 24))", File: "bufmgr.c", Line:
> 837)
> LOG: server process (PID 18986) was terminated by signal 6: Aborted
> """
> To reproduce:
> 1) run prepare_test.sql
> 2) then run the following pgbench command (sql scripts attached):
> pgbench -c 24 -j 24 -T 600 -n -f inserts(dot)sql(at)15 -f updates_1(dot)sql(at)20 -f
> updates_2(dot)sql(at)20 -f deletes(dot)sql(at)45 db_test
>
>
Looks like the patch was failing to set the block number correctly in the
t_ctid field, leading to these strange failures. There was also couple of
instances where the t_ctid field was being accessed directly, instead of
the newly added macro. I think we need some better mechanism to ensure that
we don't miss out on such things. But I don't have a very good idea about
doing that right now.
>
> * sometimes when i have made the server crash the attempt to recovery
> fails with this assertion:
> """
> LOG: database system was not properly shut down; automatic recovery in
> progress
> LOG: redo starts at 0/157F970
> TRAP: FailedAssertion("!(!warm_update)", File: "heapam.c", Line: 8924)
> LOG: startup process (PID 14031) was terminated by signal 6: Aborted
> LOG: aborting startup due to startup process failure
> """
> still cannot reproduce this one consistently but happens often enough
>
>
This could be a case of uninitialised variable in log_heap_update(). What
surprises me though that none of the compilers I tried so far could catch
that. In the following code snippet, if the condition evaluates to false
then "warm_update" may remain uninitialised, leading to wrong xlog entry,
which may later result in assertion failure during redo recovery.
7845
7846 if (HeapTupleIsHeapWarmTuple(newtup))
7847 warm_update = true;
7848
Thanks,
Pavan
--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment | Content-Type | Size |
---|---|---|
0003_warm_fixes_v6.patch | application/octet-stream | 3.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Ashutosh Bapat | 2016-12-27 13:24:56 | Re: ALTER TABLE parent SET WITHOUT OIDS and the oid column |
Previous Message | Aleksander Alekseev | 2016-12-27 13:15:41 | Re: PATCH: recursive json_populate_record() |