From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Rhaoni Chiu Pereira <rhaoni(at)sistemica(dot)info> |
Cc: | PostgreSQL ADMIN <pgsql-admin(at)postgresql(dot)org> |
Subject: | Re: How can I make PosgreSQL use an Index ? |
Date: | 2003-09-30 15:08:56 |
Message-ID: | 19517.1064934536@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
Rhaoni Chiu Pereira <rhaoni(at)sistemica(dot)info> writes:
> [ very large join plan ]
Sorry for not responding sooner --- somehow this got overlooked in my
inbox. It looks to me like the big problem is that you have all these
nested-loop joins:
> -> Nested Loop (cost=100025960.94..100027780.65 rows=1
> width=818) (actual time=3476.87..99606.77 rows=19923 loops=1)
> Join Filter: (("outer".emp = "inner".emp) AND ("outer".fil
> = "inner".fil) AND ("outer".codigo_supervisor = "inner".codigo_supervisor) AND
> ("outer".codigo_grupo_rep = "inner".codigo_grupo_rep))
> -> Nested Loop (cost=100025960.94..100027775.22 rows=1
> width=765) (actual time=3476.74..97802.69 rows=19923 loops=1)
> Join Filter: (("inner".ano_mes)::text = to_char
> ("outer".data_emissao, 'YYYYMM'::text))
> -> Nested Loop (cost=25960.94..27762.92 rows=1
> width=755) (actual time=3475.14..32090.12 rows=19923 loops=1)
> Join Filter: (("inner".emp = "outer".emp) AND
> ("outer".fil = "inner".fil) AND ("outer".codigo = "inner".cla_categoria) AND
> ("outer".codigo = "inner".cla_marca) AND ("outer".codigo = "inner".situacao))
> -> Nested Loop (cost=25960.94..27705.22
> rows=10 width=665) (actual time=3474.12..17734.21 rows=199230 loops=1)
> Join Filter: (("outer".emp
> = "inner".emp) AND ("inner".fil = "outer".fil))
> -> Nested Loop
> (cost=25960.94..27699.30 rows=1 width=638) (actual time=3474.02..6030.09
> rows=19923 loops=1)
> Join Filter: (("inner".emp
> = "outer".emp) AND ("inner".empfil = "outer".fil))
The reason the planner is choosing nestloop here is that it thinks there
are only a few rows involved (notice the estimated row counts are all "1"
or "10"). Nestloop is a fine join plan for small numbers of rows, but
it pretty well sucks for tens of thousands of rows which is what you've
actually got. I am not sure why the row-count estimates are so far off,
but it could be because the planner is unaware of cross-column
correlations in your data. Are the multiple join conditions actually
necessary, or are some of them redundant?
A quick and dirty thing you could try to see if the plan can be improved
is to set "enable_nestloop" off (do NOT set "enable_seqscan" off).
In the long run you want the planner to do better without such a
brute-force hack, though. A simple answer is to boost the statistics
target on the join columns and re-analyze, but that may not help much
if the real issue is cross-column correlations. A more invasive
solution is to reconsider your data design. It looks like you have a
lot of multi-column join keys --- can you find ways to combine those
into single columns? (As an example, I'd never build a table containing
separate date and time columns rather than a single timestamp column.)
I realize that this'd probably be a pain in the neck, but if you can do
it, it would simplify your queries as well as help the planner produce
better plans.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Anna Kanevsky | 2003-09-30 15:28:33 | Delete accident |
Previous Message | lgama@sagitario.cic.ipn.mx | 2003-09-29 19:59:27 | remove my email account, please |