| From: | "Luke Lonergan" <llonergan(at)greenplum(dot)com> |
|---|---|
| To: | "Ulrich Wisser" <ulrich(dot)wisser(at)relevanttraffic(dot)se>, pgsql-performance(at)postgresql(dot)org |
| Cc: | "Nicholas E(dot) Wakefield" <nwakefield(at)KineticNetworks(dot)com>, "Barry Klawans" <bklawans(at)jaspersoft(dot)com>, "Daria Hutchinson" <dhutchinson(at)greenplum(dot)com> |
| Subject: | Re: Need for speed 3 |
| Date: | 2005-09-01 16:37:53 |
| Message-ID: | BF3C7C71.EBC5%llonergan@greenplum.com |
| Lists: | pgsql-performance |
Ulrich,
On 9/1/05 6:25 AM, "Ulrich Wisser" <ulrich(dot)wisser(at)relevanttraffic(dot)se> wrote:
> My application basically imports Apache log files into a Postgres
> database. Every row in the log file gets imported in one of three (raw
> data) tables. My columns are exactly as in the log file. The import is
> run approx. every five minutes. We import about two million rows a month.
Bizgres Clickstream does this job using an ETL (extract, transform, and load)
process to transform the weblogs into a schema optimized for reporting.
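The raw import step Ulrich describes (one row per log line, columns exactly as in the log file) can be sketched as follows. This is a minimal illustration, not Bizgres code; the regex and column names are my assumptions for the Apache common log format, not anything from the original post:

```python
import re

# Parse one Apache common-log line into the columns a raw-data table
# would hold. Regex and field names are illustrative assumptions.
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+)'
)

def parse_line(line):
    """Return a dict of columns for one log line, or None if it doesn't match."""
    m = LOG_RE.match(line)
    if not m:
        return None
    row = m.groupdict()
    row["status"] = int(row["status"])
    # Apache logs "-" when no bytes were sent.
    row["bytes"] = 0 if row["bytes"] == "-" else int(row["bytes"])
    return row

sample = '127.0.0.1 - - [01/Sep/2005:16:37:53 +0000] "GET /index.html HTTP/1.0" 200 2326'
print(parse_line(sample))
```

Each parsed dict would then be inserted (or COPYed in bulk) into the raw table every five minutes.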
> After every import the data from the current day is deleted from the
> reporting table and recalculated from the raw data table.
This is something the optimized ETL in Bizgres Clickstream also does well.
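For readers unfamiliar with the delete-and-recalculate pattern Ulrich describes, here is a minimal sketch using an in-memory SQLite database purely for illustration; the table and column names (`raw_log`, `report_daily`) are hypothetical, not from Ulrich's schema or from Bizgres:

```python
import sqlite3

# In-memory database standing in for Postgres, to illustrate the pattern:
# drop the current day's report rows, then rebuild them from the raw data.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE raw_log (day TEXT, url TEXT, hits INTEGER)")
cur.execute("CREATE TABLE report_daily (day TEXT, url TEXT, total_hits INTEGER)")

# Raw rows from two five-minute import batches on the same day.
cur.executemany("INSERT INTO raw_log VALUES (?, ?, ?)",
                [("2005-09-01", "/index", 3),
                 ("2005-09-01", "/index", 2),
                 ("2005-09-01", "/about", 1)])

def recalculate_day(cur, day):
    """Delete the day's report rows and re-aggregate them from raw data."""
    cur.execute("DELETE FROM report_daily WHERE day = ?", (day,))
    cur.execute("""INSERT INTO report_daily (day, url, total_hits)
                   SELECT day, url, SUM(hits) FROM raw_log
                   WHERE day = ? GROUP BY day, url""", (day,))

recalculate_day(cur, "2005-09-01")
print(cur.execute(
    "SELECT url, total_hits FROM report_daily ORDER BY url").fetchall())
```

Running the recalculation again is idempotent, which is what makes the approach safe to repeat after every import.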
> What do you think of this approach? Are there better ways to do it? Is
> there some literature you recommend reading?
I recommend the Bizgres Clickstream docs; you can get them from Bizgres CVS,
and there will shortly be a live HTML link on the website.
Bizgres is free - it also improves COPY performance by almost 2x, among
other enhancements.
- Luke