Re: postgres 9.5 DB corruption

From: Thomas Tignor <tptignor(at)yahoo(dot)com>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Subject: Re: postgres 9.5 DB corruption
Date: 2019-07-25 17:24:28
Message-ID: 394260845.571649.1564075468642@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Adrian,Thanks for responding. Below is the schema data for the tables where we always see corruption. You'll notice they have triggers for a postgres extension called Slony-I which provides replication service. It's not clear if/how that's a factor, though.

ams=# \d ams.alert_instance

                    Table "ams.alert_instance"

       Column        |              Type              | Modifiers 

---------------------+--------------------------------+-----------

 alert_instance_id   | integer                        | not null

 alert_definition_id | integer                        | not null

 alert_instance_key  | character varying(500)         | not null

 start_active_date   | timestamp(0) without time zone | not null

 stop_active_date    | timestamp(0) without time zone | 

 active              | smallint                       | not null

 acknowledged        | smallint                       | not null

 ack_clear_time      | timestamp(0) without time zone | 

 user_set_clear_time | smallint                       | 

 category_id         | integer                        | not null

 condition_start     | timestamp(0) without time zone | not null

 unack_reason        | character varying(1)           | 

 viewer_visible      | smallint                       | not null

Indexes:

    "pk_alert_instance" PRIMARY KEY, btree (alert_instance_id), tablespace "tbls5"

    "idx_alert_inst_1" btree (alert_instance_key, alert_definition_id, alert_instance_id, active, acknowledged, ack_clear_time), tablespace "tbls5"

    "idx_alert_inst_cat_id" btree (category_id), tablespace "tbls5"

    "idx_alert_inst_def_id" btree (alert_definition_id), tablespace "tbls5"

Check constraints:

    "ck_alert_inst_acked" CHECK (acknowledged = 0 OR acknowledged = 1)

    "ck_alert_inst_active" CHECK (active = 0 OR active = 1)

    "ck_alert_inst_set_cl_tm" CHECK (user_set_clear_time = 0 OR user_set_clear_time = 1)

    "ck_alert_inst_viewer_vis" CHECK (viewer_visible = 0 OR viewer_visible = 1)

Foreign-key constraints:

    "fk_alert_inst_cat_id" FOREIGN KEY (category_id) REFERENCES ams.category(category_id)

    "fk_alert_inst_def_id" FOREIGN KEY (alert_definition_id) REFERENCES ams.alert_definition(alert_definition_id)

    "fk_alert_inst_unack_reason" FOREIGN KEY (unack_reason) REFERENCES ams.unack_reason(unack_reason)

Referenced by:

    TABLE "ams.alert_attribute" CONSTRAINT "fk_alert_attr_instance_id" FOREIGN KEY (alert_instance_id) REFERENCES ams.alert_instance(alert_instance_id) ON DELETE CASCADE

Triggers:

    _ams_cluster_logtrigger AFTER INSERT OR DELETE OR UPDATE ON ams.alert_instance FOR EACH ROW EXECUTE PROCEDURE _ams_cluster.logtrigger('_ams_cluster', '1', 'k')

    _ams_cluster_truncatetrigger BEFORE TRUNCATE ON ams.alert_instance FOR EACH STATEMENT EXECUTE PROCEDURE _ams_cluster.log_truncate('1')

Disabled user triggers:

    _ams_cluster_denyaccess BEFORE INSERT OR DELETE OR UPDATE ON ams.alert_instance FOR EACH ROW EXECUTE PROCEDURE _ams_cluster.denyaccess('_ams_cluster')

    _ams_cluster_truncatedeny BEFORE TRUNCATE ON ams.alert_instance FOR EACH STATEMENT EXECUTE PROCEDURE _ams_cluster.deny_truncate()

 

ams=# 

ams=# \d ams.alert_attribute

               Table "ams.alert_attribute"

      Column       |          Type           | Modifiers 

-------------------+-------------------------+-----------

 alert_instance_id | integer                 | not null

 name              | character varying(200)  | not null

 data_type         | smallint                | not null

 value             | character varying(2000) | 

Indexes:

    "pk_alert_attributes" PRIMARY KEY, btree (alert_instance_id, name), tablespace "tbls5"

    "idx_alert_attr_name" btree (name)

Foreign-key constraints:

    "fk_alert_attr_instance_id" FOREIGN KEY (alert_instance_id) REFERENCES ams.alert_instance(alert_instance_id) ON DELETE CASCADE

Triggers:

    _ams_cluster_logtrigger AFTER INSERT OR DELETE OR UPDATE ON ams.alert_attribute FOR EACH ROW EXECUTE PROCEDURE _ams_cluster.logtrigger('_ams_cluster', '2', 'kk')

    _ams_cluster_truncatetrigger BEFORE TRUNCATE ON ams.alert_attribute FOR EACH STATEMENT EXECUTE PROCEDURE _ams_cluster.log_truncate('2')

Disabled user triggers:

    _ams_cluster_denyaccess BEFORE INSERT OR DELETE OR UPDATE ON ams.alert_attribute FOR EACH ROW EXECUTE PROCEDURE _ams_cluster.denyaccess('_ams_cluster')

    _ams_cluster_truncatedeny BEFORE TRUNCATE ON ams.alert_attribute FOR EACH STATEMENT EXECUTE PROCEDURE _ams_cluster.deny_truncate()

 

ams=# 

Tom    :-)

On Wednesday, July 24, 2019, 11:15:04 AM EDT, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:

On 7/24/19 7:38 AM, Thomas Tignor wrote:
> Hello postgres community,
>
> Writing again to see if there are insights on this issue. We have had
> infrequent but recurring corruption since upgrading from postgres 9.1 to
> postgres 9.5. We are presently on 9.5.16. Our DB-facing app continually
> performs a mixture of DML, primarily inserts and updates on two specific
> tables, with no single op being suspect. In the past, corruption events
> have produced encoding errors on COPY operations (invalid byte sequence
> for encoding "UTF8"). More recently, they have caused segmentation
> faults. We were able to take a cold backup after a recent event.
> SELECTing the corrupted data on our cold backup yields the following
> stack. Any info on a solution or how to proceed towards a solution would
> be much appreciated.
>
> Thanks in advance.
>

In my previous post when I referred to table schema I mean that to
include associated schema like triggers, constraints, etc. Basically
what is returned by \d in psql.

> Tom    :-)

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kevin Brannen 2019-07-25 19:23:26 SELECT INTO question
Previous Message Adrian Klaver 2019-07-25 14:55:13 Re: Default ordering option