| From: | Noah Misch <noah(at)leadboat(dot)com> |
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | On-the-fly index tuple deletion vs. hot_standby |
| Date: | 2010-11-29 06:10:38 |
| Message-ID: | 20101129061038.GA10883@tornado.gateway.2wire.net |
| Lists: | pgsql-hackers |

I have a hot_standby system and use it to bear the load of various reporting
queries that take 15-60 minutes each. In an effort to avoid long pauses in
recovery, I set a vacuum_defer_cleanup_age constituting roughly three hours of
the master's transactions. Even so, I kept seeing recovery pause for the
duration of a long-running query. In each case, the culprit record was an
XLOG_BTREE_DELETE arising from on-the-fly deletion of an index tuple. The
attached test script demonstrates the behavior (on HEAD); the index tuple
reclamation conflicts with a concurrent "SELECT pg_sleep(600)" on the standby.

Since this inserting transaction aborts, HeapTupleSatisfiesVacuum reports
HEAPTUPLE_DEAD independent of vacuum_defer_cleanup_age. We go ahead and remove
the index tuples. On the standby, btree_xlog_delete_get_latestRemovedXid does
not regard the inserting-transaction outcome, so btree_redo proceeds to conflict
with snapshots having visibility over that transaction. Could we correctly
improve this by teaching btree_xlog_delete_get_latestRemovedXid to ignore tuples
of aborted transactions and tuples inserted and deleted within one transaction?

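
To make the question concrete, here is a toy model of the idea in Python. It is
only a sketch, not the PostgreSQL code: HeapTuple, latest_removed_xid and
conflicts_with_snapshot are hypothetical names, transaction IDs are plain
integers with no wraparound handling, and the conflict test is a simplification
of what the standby does. It compares a computation that ignores the inserter's
outcome with one that skips tuples no snapshot could ever have seen.

```python
from dataclasses import dataclass
from typing import Iterable

INVALID_XID = 0  # stand-in for "no transaction ID" in this toy model


@dataclass(frozen=True)
class HeapTuple:
    """Hypothetical, simplified heap tuple header."""
    xmin: int                   # inserting transaction
    xmax: int = INVALID_XID     # deleting transaction, if any
    xmin_aborted: bool = False  # did the inserting transaction abort?


def latest_removed_xid(tuples: Iterable[HeapTuple],
                       ignore_never_visible: bool) -> int:
    """Return the newest XID among the tuples whose index entries are removed.

    With ignore_never_visible=False, every tuple counts, whatever happened to
    its inserter. With ignore_never_visible=True, tuples that no other
    snapshot could have seen are skipped: those inserted by an aborted
    transaction, and those inserted and deleted by the same transaction.
    """
    latest = INVALID_XID
    for tup in tuples:
        if ignore_never_visible:
            if tup.xmin_aborted:
                continue
            if tup.xmax != INVALID_XID and tup.xmax == tup.xmin:
                continue
        latest = max(latest, tup.xmin, tup.xmax)
    return latest


def conflicts_with_snapshot(latest: int, snapshot_xmin: int) -> bool:
    """Simplified rule: a standby snapshot conflicts when its xmin is not
    newer than the latest removed XID."""
    return latest != INVALID_XID and snapshot_xmin <= latest


# A long-running standby query took its snapshot at xmin 900; later,
# transaction 1000 inserted a row and aborted.
aborted_insert = [HeapTuple(xmin=1000, xmin_aborted=True)]
naive = latest_removed_xid(aborted_insert, ignore_never_visible=False)
filtered = latest_removed_xid(aborted_insert, ignore_never_visible=True)
print(conflicts_with_snapshot(naive, snapshot_xmin=900))     # True
print(conflicts_with_snapshot(filtered, snapshot_xmin=900))  # False
```

In this model the aborted insert alone no longer produces a conflict, while
tuples that were genuinely visible at some point still advance the result.
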
Thanks,
nm
| Attachment | Content-Type | Size |
|---|---|---|
| repro-btree-cleanup.sh | application/x-sh | 1.5 KB |