Re: HOT WIP Patch - version 6.3

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: HOT WIP Patch - version 6.3
Date: 2007-04-02 23:17:11
Message-ID: 200704022317.l32NHBH18958@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches


Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Pavan Deolasee wrote:
> Please see the HOT version 6.3 patch attached with the mail.
> I've implemented support for CREATE INDEX and CREATE INDEX
> CONCURRENTLY based on the recent discussions. The implementation
> is not yet complete and needs some more testing/work/discussion
> before we can start considering it for review.
>
> One of the regression test case fails because CIC now works in
> three phases. In the first phase, we just create the catalog entry
> for the index and commit the transaction. If the index_build fails
> because of any error (say, unique key constraint) the index creation
> fails, but the catalog entry remains.
>
> CREATE INDEX:
> -----------------
>
> The implementation is based on having an extra attribute in pg_index
> to track the transaction xid which created the index and then use
> that information to decide whether the newly created index should
> be used in a query or not. Here are couple of TODO items:
>
> Plan Invalidation:
>
> We decided to store transaction id of the top level transaction in
> the cached plan if one or more potentially useful indexes are
> not available while planning a query. And then replan if the
> current transaction id is different that the one stored with the
> plan. I'm not very well familiar with this code, so any suggestions
> how to do it in a clean way ?
>
> Making index available in the creating transaction:
>
> This is an important TODO item. We would like to make the
> index immediately available to the transaction which created it,
> if the transaction is running in read-committed mode. If the
> transaction is running in SERIALIZABLE mode, then we can't do
> much because we might have skipped one or more RECENTLY_DEAD
> tuples while building the index and hence index can not be used.
>
> The way we build index now is that we only index the tuple at the head
> of the HOT-chain. So there could be DELETE_IN_PROGRESS
> tuples (updated/deleted by the transaction which is creating the
> index) which we skipped while building the index. My question
> is, is there a case where this transaction may use the new index
> and still see those tuples ? I know that the DELETE_IN_PROGRESS
> tuples are visible if there are any open cursors. But then plans for
> these open cursors can not be changed until they are closed
> and reopened, isn't it ? Tom mentioned about recursive plpgsql
> functions where the outer instance can use an older snapshot.
> I tried that but could not produce a scenario where the outer instance
> could see the DELETE_IN_PROGRESS tuple if the tuple is updated
> in the inner instance. Can someone help me with an example where
> a read-committed transaction would use the newly created index
> and still see the DELETE_IN_PROGRESS tuple ?
>
>
> CREATE INDEX CONCURRENTLY:
> ------------------------------
>
> One of the item which needs review and discussion is the handling
> on unique key checks while creating the index concurrently. We build
> the index in three phases. In the first phase, we just create the catalog
> entry and mark index invalid for inserts. This ensures that transactions
> started after that won't create HOT-chains that break the HOT property
> for the new index. In the second phase, we build the index by applying
> the reference snapshot to the heap tuples. In the third phase, we
> validate the index and insert any missing entries.
>
> In this phase, we only insert if index entry for the root tuple is missing.
> So there is just one insert operation which covers all the tuples in the
> HOT-chain. In order to check unique key violations, inside
> _bt_check_unique() function when a duplicate key is found, we follow
> the entire HOT-chain and check if any tuple in the chain is live. If so,
> unique key violation constraint is raised. IOW if any two HOT-chains
> share the same key and have one live tuple, unique key constraint
> is considered violated. Can anyone spot a hole in this logic ?
>
>
> Thanks,
> Pavan
>
> --
>
> EnterpriseDB http://www.enterprisedb.com

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2007-04-02 23:25:46 Re: guc patch: Make variables fall back to default values
Previous Message Bruce Momjian 2007-04-02 23:10:58 Re: scan_recycle_buffers