From: | Martin Gregorie <martin(at)gregorie(dot)org> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Versioned, chunked documents |
Date: | 2012-04-01 23:22:02 |
Message-ID: | 1333322522.1754.143.camel@zappa.gregorie.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Mon, 2012-04-02 at 00:38 +0200, Ivan Voras wrote:
> Hi,
>
> I have documents which are divided into chunks, so that the (ordered)
> concatenation of chunks make the whole document. Each of the chunks may
> be edited separately and past versions of the chunks need to be kept.
>
> The structure looks fairly simple:
>
> The first goal is to retrieve the latest version of the whole document,
> made from the latest versions of all chunks, but later the goal will
> also be to fetch the whole version at some point in time (i.e. with
> chunks created before a point in time).
>
> I did the first goal by creating two helper views:
>
> CREATE VIEW documents_chunks_last_version_chunk_ids AS
> SELECT documents_id, max(id), seq FROM documents_chunks GROUP BY
> documents_id, seq;
>
> CREATE VIEW documents_chunks_last_version_content AS
> SELECT documents_chunks.documents_id, content
> FROM documents_chunks
> JOIN documents_chunks_last_version_chunk_ids ON
> documents_chunks.id=documents_chunks_last_version_chunk_ids.max
> ORDER BY documents_chunks_last_version_chunk_ids.seq;
>
> There are indexes on the document_chunks fields seq and documents_id.
>
> Everything looked fine until I examined the output of EXPLAIN ANALYZE
> and saw this:
>
I'm not surprised. First guess: I'd use
id (FK of documents), seq, ctime
as the prime key of document_chunk, which would work for your initial
requirement but is far too simplistic to deal with the general
requirement of retrieving a specific document version. You'd probably
need something like:
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
...
);
Create table version (
version_no serial primary key,
author var char 40,
ctime timestamp );
create table document_version (
id serial references document(id),
version_number serial references version(version_no),
primary_key (id, version_no),
);
CREATE TABLE documents_chunks (
id SERIAL references document_version(id),
version_number serial references document_version(version_number),
seq serial,
content TEXT,
primary_key(id, version_number, seq)
};
Disclaimer: this is not syntax checked or tested. It may/or may not
match your requirements, but since I haven't seen your ERD or the 3NF
you derived from it I can't offer any more applicable advice.
Martin
From | Date | Subject | |
---|---|---|---|
Next Message | Martin Gregorie | 2012-04-01 23:52:35 | Re: Versioned, chunked documents |
Previous Message | Ivan Voras | 2012-04-01 23:05:58 | Re: Versioned, chunked documents |