From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec> |
Cc: | psql performance list <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: limiting results makes the query slower |
Date: | 2009-08-23 21:41:39 |
Message-ID: | 603c8f070908231441n19440d94vef2e32e28e514076@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Thu, Aug 20, 2009 at 9:50 PM, Jaime
Casanova<jcasanov(at)systemguards(dot)com(dot)ec> wrote:
> in a web app we have a query that we want to show in limited results
> at a time, this one executes in 10 seconds if i use limit but executes
> in 300ms if i remove it.
> why is that happening? the query is using and index for avoiding the
> sort so the nestloop should go only for the first 20 records on
> tgen_persona, no?
> below some more info
>
> postgresql 8.3.7
> ram 32GB
> shared_buffers 8GB
> work_mem 8MB
>
> tgen_persona has 185732 records and tcom_invitacion is a partitioned
> (by dates: 1 month every partition) table and has more than 29million
> records in the partitions
>
> explain analyze here: http://explain.depesz.com/s/B4
>
> the situation improves if i disable nestloops, explain analyze with
> nestloop off here: http://explain.depesz.com/s/Jv
>
> select Per.razon_social as MAIL,inv.cata_esta_calificacion,
> inv.observa_calificacion,
> to_char(inv.fech_crea,'YYYY:MM:DD') as fech_crea,
> case when (( select cod_estado FROM TPRO_PROVEEDOR
> WHERE id_proveedor = (select max(a.id_proveedor)
> from tpro_proveedor a
> where persona_id = Inv.persona_id )
> )='Habilitado')
> then 'Habilitado'
> else 'Deshabilitado'
> end as empresa_id
> from tgen_persona Per, tcom_invitacion Inv
> where Per.persona_id = Inv.persona_id
> and inv.id_soli_compra = '60505'
> ORDER BY Per.razon_social asc limit 20 offset 0
This is pretty common. Tom Lane pointed out in a message I don't
feel like searching for right now that LIMIT tends to magnify the
effect of bad selectivity estimates. In this case, the join
selectivity is off by more than 3 orders of magnitude right here:
Nested Loop (cost=0.00..4280260.77 rows=8675 width=588) (actual
time=4835.934..11335.731 rows=2 loops=1)
I'm not familiar with how we estimate join selectivity in this case,
but it's obviously giving really, really wrong answers. The problem
may be here:
Append (cost=0.00..22.00 rows=23 width=560) (actual time=0.055..0.055
rows=0 loops=185732)
It appears that we're estimating 23 rows because we have 23
partitions, and we're estimating one row for each. There are a lot of
places in the planner where we round off to an integer with a floor of
1, which may be part of the problem here... but I don't know without
looking at the code.
...Robert
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2009-08-23 22:26:16 | Re: [PERFORMANCE] how to set wal_buffers |
Previous Message | Jaime Casanova | 2009-08-23 20:25:59 | Re: [PERFORMANCE] how to set wal_buffers |