From: | Dev Nop <devnop0(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Storing large documents - one table or partition by doc? |
Date: | 2016-09-26 08:27:28 |
Message-ID: | CACjtUOR-YX1NW9iyE=3K0rsAXvwgxHAB9iSjR5WCr3CupfSbxA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
>
> If GUIDs *stored in a binary format* were too large, then you won't be
> terribly happy with the 24 byte per-row overhead in Postgres.
Heh. In this case the ids have a life outside the database in various text
formats.
> What I would look into at this point is using int ranges and arrays to
> greatly reduce your overhead:
> CREATE TABLE ...(
> document_version_id int NOT NULL REFERENCES document_version
> , document_line_range int4range NOT NULL
> , document_lines text[] NOT NULL
> , EXCLUDE USING gist( document_version_id =, document_line_range && )
> );
Thanks! Some new things for me to learn about there. Had to read "Range
Types: Your Life Will Never Be The Same" - lol. https://wiki.postgresql.org/
images/7/73/Range-types-pgopen-2012.pdf
To check I understand what you are proposing: the current version and
history is stored in the same table. Each line is referred to by a
sequential line number and then lines are stored in sequential chunks with
range + array. The gist index is preventing any insert with the same
version & line range. This sounds very compact for a static doc but doesn't
it mean lines must be renumbered on inserts/moves?
On Mon, Sep 26, 2016 at 9:26 AM, Dev Nop <devnop0(at)gmail(dot)com> wrote:
> If GUIDs *stored in a binary format* were too large, then you won't be
>> terribly happy with the 24 byte per-row overhead in Postgres.
>
>
> Heh. In this case the ids have a life outside the database in various text
> formats.
>
>
>> What I would look into at this point is using int ranges and arrays to
>> greatly reduce your overhead:
>> CREATE TABLE ...(
>> document_version_id int NOT NULL REFERENCES document_version
>> , document_line_range int4range NOT NULL
>> , document_lines text[] NOT NULL
>> , EXCLUDE USING gist( document_version_id =, document_line_range && )
>> );
>
>
> Thanks! Some new things for me to learn about there. Had to read "Range
> Types: Your Life Will Never Be The Same" - lol.
> https://wiki.postgresql.org/images/7/73/Range-types-pgopen-2012.pdf
>
> To check I understand what you are proposing: the current version and
> history is stored in the same table. Each line is referred to by a
> sequential line number and then lines are stored in sequential chunks with
> range + array. The gist index is preventing any insert with the same
> version & line range. This sounds very compact for a static doc but doesn't
> it mean lines must be renumbered on inserts/moves?
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Álvaro Hernández Tortosa | 2016-09-26 08:28:54 | Re: Millions of tables |
Previous Message | julyanto SUTANDANG | 2016-09-26 06:28:10 | Re: Millions of tables |