From: | John R Pierce <pierce(at)hogranch(dot)com> |
---|---|
To: | Dave Huber <DHuber(at)letourneautechnologies(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: automated row deletion |
Date: | 2009-10-07 17:01:05 |
Message-ID: | 4ACCC951.80303@hogranch.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Dave Huber wrote:
>
> A colleague gave me the following query to run:
>
>
>
> DELETE FROM data_log_20msec_table WHERE (log_id IN (SELECT log_id FROM
> data_log_20msec_table ORDER BY log_id DESC OFFSET 10000000))
>
> ...
>
> This query keeps the most recent 10 million rows and deletes the
> remaining ones. If I call this once a minute, it would be deleting
> 3000 rows each time. Is there a way to optimize this statement?
> Postgres was setup with default configuration. Is there anything we
> can change in the configuration to make this run more efficiently? The
> table is defined as below:
>
> ...
>
> Is there anything we can do here that can optimize the deletion of rows?
>
>
>
as I previously wrote...
I think you'll find row deletes would kill your performance. For time
aged data like that, we use partitioned tables, we typically do it by
the week (keeping 6 months of history), but you might end up doing it by
N*1000 PK values or some such, so you can use your PK to determine the
partition. With a partitioning scheme, its much faster to add a new
one and drop the oldest at whatever interval you need. See
http://www.postgresql.org/docs/current/static/ddl-partitioning.html
based on the numbers you give above, I think I'd do it by 100000 log_id
values, so you'd end up with 101 partition tables, and every half hour
or so you'd truncate the oldest partition and start a new one (reusing
the previously oldest in a round robin fashion). truncate is 1000s of
times faster than delete.
From | Date | Subject | |
---|---|---|---|
Next Message | Loic d'Anterroches | 2009-10-07 17:52:07 | Re: pg_dump with 1100 schemas being a bit slow |
Previous Message | Dave Huber | 2009-10-07 16:49:31 | Re: automated row deletion |