From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> |
Cc: | PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: HOT patch - version 14 |
Date: | 2007-08-30 15:16:39 |
Message-ID: | 1314.1188486999@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-patches |
"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> You are right - a new index might mean that an existing HOT chain
> is broken as far as the new index is concerned. The way we address
> that is by indexing the root tuple of the chain, but the index key is
> extracted from the last tuple in the chain. The index is marked "unusable"
> for all those existing transactions which can potentially see any
> intermediate tuples in the chain.
I don't think that works --- what if the last tuple in the chain isn't
committed good yet? If its inserter ultimately rolls back, you've
indexed the wrong value.
> Please see this document written by Greg Stark. He has nicely summarized
> how CREATE INDEX and CREATE INDEX CONCURRENTLY works with HOT.
> http://archives.postgresql.org/pgsql-patches/2007-07/msg00360.php
Isn't the extra machination for C.I.C. just useless complication?
What's the point of avoiding creation of new broken HOT chains when
you still have to deal with existing ones?
>> I also don't think I
>> believe the reasoning for not indexing DELETE_IN_PROGRESS hot-updated
>> tuples: what if the index completion commits, but the concurrent delete
>> rolls back? Then you've got a valid tuple that's not in the index.
> Since CREATE INDEX works with ShareLock on the relation, we can
> safely assume that we can't find any DELETE_IN_PROGRESS tuples except
> those deleted by our own transaction. The only exception is system relation,
> but we don't do HOT updates on system relation.
That chain of reasoning is way too shaky for me. Essentially what
you're saying is you'll promise not to corrupt non-system indexes.
Nor am I really thrilled about having to disable HOT for system
catalogs.
> The only reason to redefine MaxHeapTuplesPerPage to higher side is
> because HOT allows us to accommodate more tuples per page because
> of redirect-dead line pointers.
No, it doesn't allow more tuples per page. It might mean there can be
more line pointers than that on a page, but not more actual tuples.
The question is whether there is any real use in allowing more line
pointers than that --- the limit is already unrealistically high,
since it assumes no data content in any of the tuples. If there is a
rationale for it then you should invent a different constant
MaxLinePointersPerPage or some such, but I really think there's no
point.
> Doubling the value is chosen as a balance between heap page
> utilization, line pointer bloating and overhead for bitmap scans.
I'd say it allows a ridiculous amount of line pointer bloat.
>> Even if it's safe, ISTM what you're mostly accomplishing there is to
>> expend a lot of cycles while holding exclusive lock on the page, when
>> there is good reason to think that you're blocking other people who are
>> interested in using the page. Eliminating the separation between that
>> and cleanup would also allow eliminating the separate "PD_FRAGMENTED"
>> page state.
> The reason we did it that way because repairing fragmentation seems
> much more costly that pruning. Please note that we prune a single
> chain during index fetch. Its only for heap-scans (and VACUUM) that
> we try to prune all chains in the page. So unless we are doing heap-scan,
> I am not sure if we are spending too much time holding the exclusive
> lock. I agree we don't have any specific numbers to prove that though.
If you don't have numbers proving that this extra complication has a
benefit, I'd vote for simplifying it. The SnapshotAny case is going to
bite other people besides you in future.
> Another reasoning behind separating these two steps is: pruning
> requires exclusive lock whereas repairing fragmentation requires
> cleanup lock.
This is nonsense. Those are the same lock. If you have the former and
not the latter, it just means that you *know* there is contention for
the page. It seems to me that performing optional maintenance work in
such a situation is completely wrong.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Gavin M. Roy | 2007-08-30 15:39:30 | Re: pg_dump --no-tablespaces patch |
Previous Message | Gregory Stark | 2007-08-30 10:20:04 | Re: HOT patch - version 14 |