From: | Hannu Krosing <hannu(at)skype(dot)net> |
---|---|
To: | Gokulakannan Somasundaram <gokul007(at)gmail(dot)com> |
Cc: | Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PATCHES] Including Snapshot Info with Indexes |
Date: | 2007-10-23 13:25:47 |
Message-ID: | 1193145947.17735.33.camel@hannu-laptop |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-patches |
Ühel kenal päeval, T, 2007-10-23 kell 18:36, kirjutas Gokulakannan
Somasundaram:
>
> There are several advantages to keeping a separate visibility
> heap:
>
> 1) it is usually higly compressible, at least you can throw
> away
> cmin/cmax quite soon, usually also FREEZE and RLE encode the
> rest.
>
> 2) faster access, more tightly packed data pages.
>
> 3) index-only scans
>
> 4) superfast VACUUM FREEZE
>
> 5) makes VACUUM faster even for worst cases (interleaving live
> and dead
> tuples)
>
> 6) any index scan will be faster due to fetching only visible
> rows from
> main heap.
>
> if you have to store the visibility fields of all the tuples of each
> table, then you may not be able to accomodate in the cache. Say if a
> table is of 1 million rows, we would need 22 MB of visibility
> space(since visibility info takes 16 bytes. I think if we have to link
> it with say tuple-id(6 Bytes).
You can keep the visibility info small, by first dropping cmin/cmax and
then FREEZ'ing the tuples (setting xmin to special value), after that
you can replace a lot of visibility info tuples with single RLE encoded
tuple, which simply states, that tuples N:A to M:B are visible.
If that 1 million row table is mostly static, the static parts will soon
have (al lot) less than 1 bit in visibility heap.
For example, after vacuum there will be just one visibility info which
say that whole table is visible.
I envision HOT-like on-the-fly VACUUM FREEZE manipulations of visibility
info so it won't grow very big at all.
> I think we may need to link it with indexes with one more id. i am not
> counting that now).
why ?
we will keep visibility info for ctids (PAGE:NR) and if we need to see,
if any ctid pointe from index points to a visible tuple we check it
based on that ctid.
> If we have 10 tables, then we will have 220 MB. Keeping them pinned
> in memory may not be advisable in some circumstances.
no no! no pinning, the "mostly in cache" will happen automatically (and
I mean mostly in processors _internal_ L1 or L2 cache, not just in RAM)
> If it is not going to be in memory, then that is no different from
> referring a table. But i accept that is a concept worth trying out. I
> think the advantage with thick indexes comes with the fact, that it is
> optional. If we can make this also as optional, that would be better.
> But if we are going to suggest it as a replacement of DSM, then it
> loses the advantage of being small.
I agree that a single-purpose DSM can be made smaller than multi-purpose
visibility heap.
------------------
Hannu
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2007-10-23 13:29:58 | Re: Feature Freeze date for 8.4 |
Previous Message | Heikki Linnakangas | 2007-10-23 13:16:58 | Re: [PATCHES] Including Snapshot Info with Indexes |
From | Date | Subject | |
---|---|---|---|
Next Message | Hannu Krosing | 2007-10-23 13:50:22 | Re: [PATCHES] Including Snapshot Info with Indexes |
Previous Message | Heikki Linnakangas | 2007-10-23 13:16:58 | Re: [PATCHES] Including Snapshot Info with Indexes |