From: | Ian Lance Taylor <ian(at)airs(dot)com> |
---|---|
To: | Joseph Shraibman <jks(at)selectacast(dot)net> |
Cc: | PostgreSQL General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Weird indices |
Date: | 2001-02-21 07:25:46 |
Message-ID: | sig0h8ijk5.fsf@daffy.airs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Joseph Shraibman <jks(at)selectacast(dot)net> writes:
> > Note that this all implies that when walking through the index to find
> > heap tuples, you must check the current validity of each heap tuple.
> > It is normal for an index tuple to point to a heap tuple which has
> > been deleted.
>
> <snip>
> > >
> > > I'm talking about indices. The index should be updated to only point at
> > > valid rows.
> >
> > When should the index be updated to only point at valid rows? That is
> > only possible when a heap tuple is finally and completely removed.
> > But currently that is only known at the time of a VACUUM.
> >
> You just said above 'It is normal for an index tuple to point to a heap
> tuple which has been deleted.'
I'm not sure what your point is here.
I can see that there is an ambiguity in what I said. I said it is
normal for an index tuple to point to a heap tuple which has been
deleted. However, although the heap tuple has been deleted in present
time, there may still be transactions which do not see that heap tuple
as having been deleted. So the index tuple is still meaningful until
all those transactions have completed.
When all such transactions have completed is the case I meant when I
described the heap tuple as finally and completely removed.
> > Consider a transaction which sits around for a while and then aborts.
> > At the moment that the transaction aborts, Postgres may become able to
> > remove some heap tuples and some index tuples, and it might be invalid
> > for Postgres to remove those tuples before the transaction aborts.
> >
> > But the transaction might never have looked at those tuples. So,
> > given the Postgres data structures, the only way to keep an index
> > fully up to date would be to effectively run a VACUUM over the entire
> > database every time a transaction aborts.
>
> Why? There is a mechanism for keeping track of which heap tuples are
> valid, why not index tuples? It is the nature of indices to be updated
> on inserts, why not deletes? I would think that the advantage of being
> able to use the index in the planner would outweigh the immediate cost
> of doing the update.
You're right. The mechanism used to preserve multiple versions of
heap tuples could be extended to index tuples as well.
Based on the heap tuple implementation, this would require adding two
transaction ID's and a few flags to each index tuple. That's not
insignificant. In a B-tree, right now, I think there are 8 bytes plus
the key for each item in the tree. This would require adding another
10 bytes or so. That's a lot.
Also, more work would be required for every update. Right now an
update requires a B-tree insert for each index. With this change,
every update would require an additional B-tree lookup and write for
each index. That would require on average a bit less than one
additional block write per index. That's a lot.
In exchange, certain queries would become faster. Specifically, any
query which only needed the information found in an index would become
faster. Each such query would save on average a bit less than one
additional block read per value found in the index. But since the
indices would be less efficient, some of the advantage would be lost
due to extra block reads of the index.
What you are suggesting seems possible, but it does not seem to be
obviously better.
If you feel strongly about this, the most reasonable thing would be
for you to implement it, and test the results. Since as far as I can
see what you are suggesting is not clearly better, it's unlikely that
anybody else is going to go to the considerable effort of implement it
on your behalf.
> > > But if the index isn't used by the planner then the point
> > > is moot.
> >
> > As far as I know the index itself isn't used by the planner.
> >
> But could be. As I understand it the reason the index isn't used by the
> planner is because the index could point at non-visible rows (row = heap
> tuple). If the index could be used, many things now that are seq scans
> could be converted to faster index scans.
Some things could, sure. It's not obvious to me that many things
could. The planner can't spend a lot of time looking at an index to
decide whether or not to use it. If it's going to do that, it's
better off to just decide to use the index in the first place. Index
examination is not free. It requires disk reads just like everything
else.
> > I don't think there is any way to do that today. It would be possible
> > to implement something along the lines I suggest above. I have no
> > idea if the Postgres maintainers have any plans along these lines.
> >
> At the end of a transaction, when it sets the bit that this tuple isn't
> valid, couldn't it at the same time also remove it if was no longer
> visible to any transaction? It wouldn't remove the need for vacuum
> because there may be another transaction that prevents it from being
> removed right then and there.
Yes, this could be done. It wouldn't speed things up, though. In
fact, it would slow them down. The only advantage would be that
VACUUM would be required less often--an advantage which is not
insignificant.
I would guess that in the average multi-user database less than half
of the tuples could be deleted at that point. It would be easy to
instrument Postgres to test this--why don't you try that?
Ian
From | Date | Subject | |
---|---|---|---|
Next Message | Steff | 2001-02-21 07:30:40 | problems with copy |
Previous Message | Tom Lane | 2001-02-21 06:33:01 | Re: Bug in my ( newbie ) mind? |