From: David Garamond <lists(at)zara(dot)6(dot)isreserved(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: how many record versions
Date: 2004-05-24 06:03:54
Message-ID: 40B1904A.3090307@zara.6.isreserved.com
Lists: pgsql-general
Greg Stark wrote:
>>Actually, each record will be incremented probably only thousands of times a
>>day. But there are many banners. Each record has a (bannerid, campaignid,
>>websiteid, date, countrycode) "dimensions" and (impression, click) "measures".
>
> In the past, when I had a very similar situation, we kept the raw impression and
> click event data. I.e., one record per impression in the impression table and
> one record per click in the click table.
> That makes the tables insert-only which is efficient and not prone to locking
> contention. They would never have to be vacuumed except after purging old data.
Assuming there are 10 million impressions per day, the impression
table will grow at least 200-400MB per day, is that correct? What do you
do, and how often do you purge old data? Do you do a mass DELETE on the
impression table itself, or do you switch to another table? I've found
that deleting tens or hundreds of thousands of rows, at least in InnoDB,
takes a long, long time (plus it hogs the CPU and slows other queries).
> Then to accelerate queries we had denormalized aggregate tables with a cron
> job that did the equivalent of
>
> insert into agg_clicks (clicks, bannerid)
> select count(*), bannerid
> from clicks
> where date between ? and ?
> group by bannerid;
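(For a cron job like the one quoted above, it helps to make the rollup
idempotent, so an accidental re-run doesn't double-count. A sketch, again
assuming a hypothetical `agg_clicks(day, bannerid, clicks)` schema and a
`ts` timestamp column on `clicks`:)

```sql
-- Sketch: re-aggregate one day inside a single transaction; deleting the
-- day's old rows first makes re-running the job safe (no double counts).
BEGIN;
DELETE FROM agg_clicks WHERE day = '2004-05-23';
INSERT INTO agg_clicks (day, bannerid, clicks)
SELECT date_trunc('day', ts)::date, bannerid, count(*)
FROM clicks
WHERE ts >= '2004-05-23' AND ts < '2004-05-24'
GROUP BY 1, 2;
COMMIT;
```

The half-open range (`>= ... AND < ...`) avoids the off-by-one-day
ambiguity that `BETWEEN` has with timestamp columns.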
--
dave
Next Message: Marc Slemko | 2004-05-24 06:23:28 | Re: how many record versions
Previous Message: David Garamond | 2004-05-24 05:58:28 | Re: how many record versions