Re: CREATE INDEX and HOT - revised design

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Pavan Deolasee <pavan(dot)deolasee(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: CREATE INDEX and HOT - revised design
Date: 2007-03-21 02:36:57
Message-ID: 200703210236.l2L2avh01156@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


I have read the HOT discussion and wanted to give my input. The major
issue is that CREATE INDEX might require a HOT chain to be split apart
if one of the new indexed columns changed in the HOT chain.

As for the outline below, there is no way we are going to add new ALTER
TABLE and CHILL commands to make this work. Can you imagine what kind
of warts we would have in the system if we added such things every time
we got stuck? And once the warts are in, it is hard to educate people
once they are removed. We need to keep going until we have a solution
that is as user-invisible as possible. While I understand the
frustration that we have not discussed this enough, I don't want us
rushing to a solution either until it has been totally thought through.

HOT is not a feature only a few people are going to want to use ---
everyone will want it, and if the user process is cumbersome, we will get
never-ending questions about how to make it work.

Let's all think about this for the next few days.

---------------------------------------------------------------------------

Pavan Deolasee wrote:
>
>
> There are few things I realized over the weekend while going
> through the code:
>
> 1. It looks like a bad idea to use ALTER TABLE .. to chill a table
> becuase ALTER TABLE takes AccessExclusive lock on the table.
> But it would still be a good idea to have ALTER TABLE .. to turn
> HOT-updates ON/OFF.
>
> 2. May be I was too paranoid about deadlocks. CREATE INDEX upgrade
> its lock anyways and is prone to deadlock. So as long as we don't
> create new deadlock scenarios, we should be fine.
>
> 3. CREATE INDEX CONCURRENTLY must be run as a seperate transaction.
> So its should be acceptable if we run CHILL as a seperate transaction.
>
> 4. CREATE INDEX CONCURRENTLY runs with ShareUpdateExclusiveLock and
> thus locks out concurrent CREATE INDEX [CONCURRENTLY], VACUUM and
> VACUUM FULL. We can do the same for CHILL to avoid any unnecessary
> race conditions between all of these.
>
>
> So here is my plan. Let me know your suggestions/comments/objections.
>
>
> Changes to pg_class and new DDLs:
> ---------------------------------
>
> We add two boolean attributes to pg_class: hot_update and hot_fetch.
>
> We introduce two DDLs to turn HOT on/off. Note that the DDL
> itself does not CHILL the table, but only affects the
> subsequent UPDATEs.
>
> postgres=# ALTER TABLE <tblname> ENABLE HOT;
> postgres=# ALTER TABLE <tblname> DISABLE HOT;
>
> These DDLs would acquire AccessExclusive lock on the table and
> set hot_update to true/false using simple_heap_update()
>
>
> CREATE INDEX [CONCURRENTLY]:
> ----------------------------
>
> If a HEAP_ONLY tuple is found, error out with a HINT to run
> CHILL on the table and then retry.
>
> If CHILL_IN_PROGRESS tuple is found, VACUUM is required on
> the table.
>
>
> CHILL utility:
> --------------
>
> We introduce a new command to chill a table. The syntax for the
> same could be:
>
> postgres=# CHILL [VERBOSE] <tblname>;
>
> UPDATE/INSERT/SELECT would work while the table is being chilled.
> But CREATE INDEX [CONCURRENTLY], VACUUM [FULL] and another CHILL
> would be locked out. As a side-effect, HOT-updates are turned off on the
> table and explicit ALTER TABLE ENABLE HOT is required to turn
> HOT-updates on again.
>
> Here is the algoirthm to CHILL table.
>
> 1. Check if CHILL is running inside a transaction block, error
> out if so.
>
> 2. Start a new transaction
>
> 3. Acquire ShareUpdateExclusiveLock on the relation. This would
> allow concurrent SELECT/INSERT/UPDATE, but lock out concurrent
> CHILL, VACUUM [FULL] and CREATE INDEX [CONCURRENTLY]
>
> 4. Set hot_update to false and update pg_class using
> simple_heap_update()
>
> 5. Acquire ShareUpdateExclusiveLock for the entire session.
>
> 6. Commit the transaction
>
> 7. Start a new transaction
>
> 8. Wait for all transactions in the current snapshot to finish.
> This would ensure that there are no HOT-updates possible further
>
> 9. Start the first heap scan, CHILL HEAP_ONLY tuples by inserting
> appropriate index entries and setting CHILL_IN_PROGRESS flag.
> WAL log the operation
>
> 10. If a CHILL_IN_PROGRESS is already set on the tuple, we can't
> be sure whether the corresponding index entry already exists
> or not. One option is to error out and force VACUUM on the table.
> Alternatively, the index_insert can be enhanced to check if a
> the same entry already exists.
>
> 11. When the entire heap is chilled, set hot_fetch to false
> and update pg_class using simple_heap_update()
>
> 12. Commit the transaction
>
> 13. Start a new transaction
>
> 14. Wait for all transactions in the current snapshot to finish.
> This would ensure that all the subsequent index scans would
> only use direct path from the index.
>
> 15. Start second heap scan. Reset CHILL_IN_PROGRESS, HEAP_ONLY
> and HOT_UPDATED flags. WAL log the operations. We may not need
> this, but we can revisit this later to optimize WAL logging.
>
> 16. When the second scan is complete, set hot_fetch to
> true and update pg_class using simple_heap_update(). There are
> no HOT_UPDATED tuples in the heap at this moment, but we should
> reset the state neverthless.
>
> 17. Commit the transaction.
>
> If the CHILL command crashes before completing the operation,
> we might be left with hot_update/hot_fetch turned OFF. Administrative
> command is needed to turn them ON again. But there won't be any
> correctness problems in the meantime.
>
> The uncleaned tuples left with CHILL_IN_PROGRESS flags would
> require VACUUM for cleanup.
>
> Index Fetch:
> ------------
>
> If hot_fetch is true, we ignore direct paths from the
> index to HEAP_ONLY tuples
>
> If hot_fetch is false, we ignore HOT_UPDATED flags
> and only use direct paths from the index.
>
> VACUUM [FULL]:
> --------------
>
> If a CHILL_IN_PROGRESS flag found, collect that tuple for
> index removal irrespective of whether the tuple is DEAD
> or not. If the tuple is not DEAD, only the CHILL_IN_PROGRESS
> flag is reset in the second pass.
>
>
> Is the plan acceptable ? If there are no objections to the
> algorithms or the behavior in general, I would start working
> on this with a target of feature freeze.
>
>
> Thanks,
> Pavan
>
>
> --
>
>
> EnterpriseDB http://www.enterprisedb.com
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dennis Bjorklund 2007-03-21 04:34:39 Re: Money type todos?
Previous Message Germán Poó Caamaño 2007-03-21 02:26:22 GSoC's possible project