From: Josh Berkus <josh(at)agliodbs(dot)com>
To: "Ron Mayer" <ron(at)intervideo(dot)com>, "Matthew Nuzum" <cobalt(at)bearfruit(dot)org>, "'Pgsql-Performance'" <pgsql-performance(at)postgresql(dot)org>
Cc: <ron(at)intervideo(dot)com>
Subject: Re: Caching (was Re: choosing the right platform)
Date: 2003-04-11 16:15:15
Message-ID: 200304110915.15060.josh@agliodbs.com
Lists: pgsql-performance
Ron,
> In a data warehouse we have here, we load about 3 million rows
> each week; load time improved from about 9 to 7 hours
> by breaking up such queries into expressions that only require
> one sort at a time, and surrounding the expressions with
> "set sort_mem=something_big" statements to give it enough
> space to not hit the disk.
>
> SET SORT_MEM=300000;
> CREATE TEMPORARY TABLE potential_new_values AS
> SELECT DISTINCT val FROM import_table;
> ...
> SET SORT_MEM=1000;
>
> Anyone else have similar experience, or am I doing something
> wrong to need so much SORT_MEM?
No, this sounds very reasonable to me. I do a similar operation on one of my
systems as part of a nightly data transformation for reporting. Since I
haven't had to run those transformations on tables over 150,000 rows, I
haven't seen the kind of RAM usage you're experiencing.
> Below is an example of another real-world query from the same
> reporting system that benefits from a sort_mem over 32M.
> Explain analyze (below) shows a 40% improvement by having
> the sort fit in memory.
Cool! That's a perfect example of sizing sort_mem for the query. Mind if I
steal it for an article at some point?
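For anyone following along, the pattern Ron describes is worth spelling out: raise sort_mem for just the statements that need one big in-memory sort, then drop it back down. A minimal sketch below (the dimension_table join and all table/column names beyond Ron's example are hypothetical; note that sort_mem is measured in kilobytes, so 300000 is roughly 293 MB per sort):

```sql
-- Give this session enough sort space to keep the DISTINCT in memory
-- (sort_mem is in KB; applies per sort, so keep concurrency in mind).
SET sort_mem = 300000;

-- Each statement performs only one large sort at a time.
CREATE TEMPORARY TABLE potential_new_values AS
  SELECT DISTINCT val FROM import_table;

-- Hypothetical follow-up step: keep only values not already loaded.
CREATE TEMPORARY TABLE genuinely_new_values AS
  SELECT p.val
    FROM potential_new_values p
    LEFT JOIN dimension_table d ON d.val = p.val
   WHERE d.val IS NULL;

-- Return to a modest default once the heavy lifting is done.
SET sort_mem = 1000;
```

Because SET only affects the current session, the large allocation never leaks into other backends' queries.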
--
Josh Berkus
Aglio Database Solutions
San Francisco