From: | pgsql(at)mohawksoft(dot)com |
---|---|
To: | "Bort, Paul" <pbort(at)tmwsystems(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: New form of index "persistent reference" |
Date: | 2005-02-10 18:12:19 |
Message-ID: | 16834.24.91.171.78.1108059139.squirrel@mail.mohawksoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> If that ID is the only thing you use to access that data, why not just
> store
> it in a flat file with fixed-length records? seek() (or your language's
> equivalent) is usually fast.
As a matter of policy, I would never manage data outside of the database.
>
> If you need to drive that from within PostgreSQL, you would need an
> untrusted language to read the file, but you could also generate it from a
> table using a trigger.
Very ugly.
>
> Or maybe use a serial column, an index on that column, and cluster the
> table
> on that index. It's more than one lookup, but not much with a Btree index.
> (Not sure if this is better than just using a serial and an index.
> http://www.postgresql.org/docs/8.0/interactive/sql-cluster.html says it
> isn't, if I read it correctly.)
Clustering is OK, but it doesn't handle updates and additions until you
recluster the data.
If a static reference is all that is needed, then merely using CTID would
suffice. I was thinking a little overhead for a reference table would
allow it to hook into PostgreSQL and keep it up to date.
>
> Then anytime there is a batch of updates to the table, re-cluster it.
Yea, like I said, there are easier ways of doing that with fairly static
data.
>
>> -----Original Message-----
>> From: pgsql(at)mohawksoft(dot)com [mailto:pgsql(at)mohawksoft(dot)com]
>> Sent: Thursday, February 10, 2005 11:22 AM
>> To: pgsql-hackers(at)postgresql(dot)org
>> Subject: [HACKERS] New form of index "persistent reference"
>>
>>
>> For about 5 years now, I have been using a text search engine
>> that I wrote
>> and maintain.
>>
>> In the beginning, I hacked up function mechanisms to return
>> multiple value
>> sets and columns. Then PostgreSQL aded "setof" and it is was
>> cool. Then it
>> was able to return a set of rows, which was even better.
>>
>> Lately, I have been thinking that a cool form of index would
>> be some sort
>> of "persistent reference" index. Like the old ISAM days of
>> yore, a fixed
>> number could point you right to the row that you want. I'm
>> not sure if the
>> "persistent reference" is a specific auto numbering column type or
>> separate index structure or both.
>>
>> I asked the question how do you get a record without going through an
>> index, the answer was CTID, which unfortunately changes when
>> the row is
>> updated.
>>
>> Now, what I want to brainstorm is some sort of "persistent reference"
>> where the value is not algorithmically stored, maybe just an
>> offset into a
>> table. The number of operations should be about 1 per lookup.
>>
>> Imagine a dynamically growing array that has one slot per
>> row. Every row
>> is considered unique. Rows which are updated, their CTID is
>> updated in the
>> reference. (with vacuum?)
>>
>> Imagine something like this:
>>
>> create table foobar(id reference, name varchar, value varchar);
>>
>> select * from foobar where id = 100;
>>
>> The reference type has an implicit index that is basically a
>> lookup table.
>> On unique references where the reference value is fairly
>> arbitrary, this
>> would be a HUGE gain for direct lookups. There is no need for
>> the NlogN of
>> a tree.
>>
>> On the surface level, this would be a huge win for websites that use
>> semi-fixed tables of data.
>>
>>
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 1: subscribe and unsubscribe commands go to
>> majordomo(at)postgresql(dot)org
>>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Merlin Moncure | 2005-02-10 18:26:34 | Re: New form of index "persistent reference" |
Previous Message | Bort, Paul | 2005-02-10 17:55:10 | Re: New form of index "persistent reference" |