From: | "Herouth Maoz" <herouth(at)unicell(dot)co(dot)il> |
---|---|
To: | "Steve Crawford" <scrawford(at)pinpointresearch(dot)com>, <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Table partitioning |
Date: | 2013-10-28 21:07:19 |
Message-ID: | FB6D59EFE2A13D4B9A3E85F0C3363EED06AF9881@mail.UniCell.local |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Everything is slow about it - selects, deletes and inserts, that is. I don't do updates on that table.
The inserts and deletes are less of an issue because they are done once a week. Of course it would be nicer if they were faster, but that's less of an issue. The real issue is with self-joins, which are a common query. But I have indexes on the relevant fields: the connecting field (the one used for the self-join) as well as the date field. The queries are mostly of the format SELECT ... FROM tablename t1 left join tablename t2 on t1.indexed_field = t2.indexed_field and t2.boolean_field where t1.date_field >= 'date1' and t1.date_field < 'date2' and not t1.boolean; This can take about 15 minutes to run, depending on the date range. And it doesn't matter much if I put a date range within the left join, either.
As I mentioned in another message, I don't like the idea that my script may fail, and then inserts into the table might start to fail as well. I'm not always available to run it manually within a set time. And writing an automation that is different than all the other tables I maintain in that database makes for maintenance spaghetti. I also don't like running automated DDL commands. They don't play well with backups.
-----הודעה מקורית-----
מאת: Steve Crawford [mailto:scrawford(at)pinpointresearch(dot)com]
נשלח: ב 28/10/2013 22:31
אל: Herouth Maoz; pgsql-general(at)postgresql(dot)org
נושא: Re: [GENERAL] Table partitioning
On 10/28/2013 09:27 AM, Herouth Maoz wrote:
> I have a rather large and slow table in Postgresql 9.1. I'm thinking of partitioning it by months, but I don't like the idea of creating and dropping tables all the time.
What is slow about it? Inserting? Selecting? Deleting? Partitioning can
assist with some issues but does no good if what you really need is an
index or better query. Partitioning shines as an option to manage
archiving/purging of time-series data but only if you work with it, not
against it.
What don't you like about creating and dropping tables? You can easily
automate it: https://github.com/keithf4/pg_partman
>
> I'm thinking of simply creating 12 child tables, in which the check condition will be, for example, date_part('month'', time_arrived) = 1 (or 2 for February, 3 for March etc.).
>
> I'll just be deleting records rather than dropping tables, the same way I do in my current setup. I delete a week's worth every time.
You are missing out on one of the best aspects of partitioning. Compared
to dropping or truncating a child table, deleting is far slower and
causes table bloat which may impact future queries.
>
> Second, when I delete (not drop!) from the mother table, are records deleted automatically from the child tables or do I need to create rules/triggers for that?
>
Yes unless you use the keyword "ONLY": "If specified, deletes rows from
the named table only. When not specified, any tables inheriting from the
named table are also processed."
Cheers,
Steve
From | Date | Subject | |
---|---|---|---|
Next Message | Merlin Moncure | 2013-10-28 21:11:24 | Re: Cursor Example Needed |
Previous Message | Perry Smith | 2013-10-28 21:06:30 | Cursor Example Needed |