From: | "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com> |
---|---|
To: | "Bruno Rodrigues Siqueira" <bruno(at)ravnus(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: select on 1milion register = 6s |
Date: | 2007-07-29 04:51:41 |
Message-ID: | dcc563d10707282151m7a2de11ex5f593cfc19b98095@mail.gmail.com |
Lists: pgsql-performance
On 7/28/07, Bruno Rodrigues Siqueira <bruno(at)ravnus(dot)com> wrote:
>
> Ok.
> QUERY PLAN
> Sort (cost=11449.37..11449.40 rows=119 width=8) (actual
> time=14431.537..14431.538 rows=2 loops=1)
> Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
> -> HashAggregate (cost=11448.79..11448.96 rows=119 width=8) (actual
> time=14431.521..14431.523 rows=2 loops=1)
> -> Index Scan using detalhamento_bas_idx3003 on detalhamento_bas
> (cost=0.00..11442.95 rows=11679 width=8) (actual time=0.135..12719.155
> rows=2335819 loops=1)
See the row mismatch there? It expects about 11k rows, gets back 2.3
million. That's a pretty big misestimate. Have you run analyze
recently on this table?
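If it hasn't been, something as simple as this (table name taken straight from your plan) should tighten up those estimates:

    analyze detalhamento_bas;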
Is there a reason you're doing this:
to_char( data_encerramento ,'yyyy-mm')
between '2006-12' and '2007-01'
when you should be able to just do:
data_encerramento between '2006-12-01' and '2007-01-31'
? That form should be able to use good estimates from analyze. My guess
is the planner is making a bad guess because of the way you're
handling the dates.
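As a rough sketch only (I'm guessing at the rest of your query from the
plan, and assuming data_encerramento is a date or timestamp column),
something along these lines lets the planner use the column's statistics
directly, and the half-open upper bound also catches any rows on Jan 31
that carry a time of day:

    select to_char(data_encerramento, 'yyyy-mm') as mes, count(*)
      from detalhamento_bas
     where data_encerramento >= '2006-12-01'
       and data_encerramento <  '2007-02-01'
     group by 1
     order by 1;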
> SERVER
> DELL PowerEdge 2950
> XEON Quad-Core 3.0Ghz
> 4Gb RAM
> Linux CentOS 5.0 64-bits
> Postgres 8.1.4
> Postgresql.conf
> # - Memory -
>
> shared_buffers = 50000 # min 16 or max_connections*2, 8KB
400 Meg is kind of low for a server with 4 GB of RAM. 25% is more
reasonable (i.e. 125000 buffers).
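In postgresql.conf terms, something like:

    shared_buffers = 125000			# 8KB each, ~1 Gig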
> work_mem = 3145728 # min 64, size in KB
> maintenance_work_mem = 4194304 # min 1024, size in KB
Whoa nellie! That's ~3 Gig of work_mem and 4 Gig of maintenance_work_mem.
On a machine with 4 Gig of RAM, that's a recipe for disaster.
Something more reasonable would be 128000 (~125 Meg) for each; since
you've limited your machine to 10 connections, you should be OK.
Setting work_mem too high can run your machine out of memory and into
a swap storm that will kill performance.
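Something like this would be a lot safer:

    work_mem = 128000			# size in KB, ~125 Meg per sort/hash
    maintenance_work_mem = 128000	# size in KB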
> fsync = off # turns forced synchronization on or
> off
So, the data in this database isn't important? Because that's what
fsync = off says to me. Better to buy yourself a nice battery-backed
caching RAID controller than to turn off fsync.
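In other words, unless losing committed transactions in a crash really is
acceptable here:

    fsync = on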
> effective_cache_size = 41943040 # typically 8KB each
And your machine has 343,597,383,680 bytes (41943040 pages at 8 KB each)
of memory available for caching? Seems a little high to me.
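As a rough guess only (tune it from what free / top report as OS cache on
that box), something on the order of:

    effective_cache_size = 393216	# 8KB each, ~3 Gig

would be much closer to reality on a 4 Gig machine.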
> random_page_cost = 1 # units are one sequential page
> fetch
Seldom if ever is it a good idea to bonk the planner on the head with
random_page_cost = 1. Setting it to 1.2 or 1.4 is low enough, but 1.4
to 2.0 is more realistic.
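e.g.:

    random_page_cost = 2		# anywhere in the 1.4 to 2.0 range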
> stats_start_collector = off
> #stats_command_string = off
> #stats_block_level = off
> #stats_row_level = off
> #stats_reset_on_server_start = off
I think you need stats_row_level on for autovacuum, but I'm not 100% sure.
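If you're relying on autovacuum on 8.1, I believe that means both of these
(worth double-checking the docs):

    stats_start_collector = on
    stats_row_level = on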
Let us know what happens after fixing these settings, running analyze,
and re-running explain analyze, with any changes to the query.