Re: Need advice for handling big data in postgres

From: Vincent Veyron <vv(dot)lists(at)wanadoo(dot)fr>
To: tobias(at)streethawk(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Need advice for handling big data in postgres
Date: 2015-01-15 13:42:08
Message-ID: 20150115144208.82c77ee4f6e03ae843a81687@wanadoo.fr
Lists: pgsql-general

On Wed, 14 Jan 2015 11:42:45 +1100
Tobias Fielitz <tobias(at)streethawk(dot)com> wrote:
>
> OPTION 1 - PARTITIONING:
> For each query only a few columns are interesting and I could partition
> the table (as it was suggested on SO)
> by *created* and by *code*.
> There are roughly 10 different codes and I would keep data for the last
> two months (partitioned by day). So I would end up having 10 * 60 = 600
> partitions.
> For every partition I could create indexes that are relevant to that
> partition (examples: *created*, *created_on_server* or *latitude* and
> *longitude*).
>
> OPTION 2 - MULTIPLE TABLES:
> I could create the tables myself: one for location log lines, one for
> comment log lines etc. and store them via python in the correct table
> (depending on *code*). Each of these tables would only have the columns
> and indexes needed.
>
> OUTCOME:
> I expect partitioning to be faster because Postgres selects the correct
> partition for me automatically. I can easily get rid of old data by
> dropping the corresponding partition. The downside of the partition
> approach is that all partitions inherit all columns of the master table
> which is unnecessary (and consumes disc space?).
>
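For reference, option 1 might be sketched like this with the inheritance-based partitioning available in current (9.x) releases; table and column names here are hypothetical, and on a future release with built-in declarative partitioning the DDL would differ:

```sql
-- Parent table; children inherit all of its columns.
CREATE TABLE log (
    id        bigserial,
    code      int         NOT NULL,
    created   timestamptz NOT NULL,
    latitude  double precision,
    longitude double precision
);

-- One child per (code, day); the CHECK constraints let the planner
-- skip irrelevant partitions when constraint_exclusion is enabled.
CREATE TABLE log_code1_20150114 (
    CHECK (code = 1),
    CHECK (created >= '2015-01-14' AND created < '2015-01-15')
) INHERITS (log);

-- Indexes are per-partition, so each child indexes only what it needs.
CREATE INDEX ON log_code1_20150114 (created);

-- Expiring old data is just dropping the child table.
DROP TABLE log_code1_20150114;
```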

I gather from the comments in this list that null fields have a very low overhead; see:

http://www.postgresql.org/message-id/87prx92lj9.fsf@oxford.xeocode.com

I would worry a lot more about the maintenance problems option 2 will induce: if a code value changes or is added/deleted, your python script needs updating. SQL queries will also be a lot more complicated (UNION selects across the various tables) and harder to optimize. Keeping the script and the tables coherent will get harder and harder.
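To illustrate the query complexity: with one table per code, any question that spans several codes must enumerate every table by hand, and each new code type means editing every such query. A hypothetical sketch:

```sql
-- Option 2: a cross-code query is a UNION over every per-code table.
SELECT created, 'location' AS kind FROM location_log
 WHERE created > now() - interval '1 hour'
UNION ALL
SELECT created, 'comment' AS kind FROM comment_log
 WHERE created > now() - interval '1 hour';

-- Option 1: the same question is a single query on the parent table:
-- SELECT created, code FROM log
--  WHERE created > now() - interval '1 hour';
```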

--
Salutations, Vincent Veyron

https://libremen.com/
Legal case, contract and insurance claim management software
