Re: Bad plan on a huge table query

From: Daniel Cristian Cruz <danielcristian(at)gmail(dot)com>
To: Alban Hertroys <haramrae(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Bad plan on a huge table query
Date: 2013-03-21 19:30:43
Message-ID: CACffM9Fod56hVNeevqLFTkcVNcaD=Uch+gyRM_jFq_BkQZur+A@mail.gmail.com
Lists: pgsql-general

2013/3/21 Alban Hertroys <haramrae(at)gmail(dot)com>

> On Mar 20, 2013, at 22:36, Daniel Cristian Cruz <danielcristian(at)gmail(dot)com>
> wrote:
>
> Hi,
>
> I'm trying to figure out why the planner arrives at a 1-row estimate when
> using nested loops over a big table. The query never returns:
>
> http://explain.depesz.com/s/GRs
>
>
> That plan contains no actual statistics, which makes it difficult to say
> anything about it. And you didn't provide any info on table definitions or
> indexes whatsoever, so we have to put that together from the generated query
> plans. Not great...
>

My bad... I assumed the plan alone would be enough. And I've just realized
that I lost the first query... Now the query looks like this:

> It returns if I disable nested loops, but the plan is still poor:
>
> http://explain.depesz.com/s/fMY
>
>
> You could probably gain some here by adding an index on
> aluno_mensal.id_medicao. In step 14 the lack thereof causes a seqscan over
> more than a million rows.
>

There is already an index on id_medicao. The plan used a hash join because I
disabled merge joins; the merge join is the one that uses the index, but with
it the query never returns.
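
For reference, this is roughly how I tested it (a sketch from memory, not the
exact session):

    -- session-level planner toggles, for diagnosis only
    SET enable_nestloop = off;   -- with this off, the query returns
    SET enable_mergejoin = off;  -- with this off too, the planner falls back to a hash join
    EXPLAIN (ANALYZE, BUFFERS) SELECT ...;  -- the real query goes here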

> What I also find a bit peculiar is that the filter in step 7 appears to
> apply a function (date_part(text, date)) on every row in that heap. Do you
> perhaps have a functional index on that table that makes that operation
> efficient?
>

Yes, I tried to improve performance by creating an index on inicio using
CAST(inicio AS DATE). The real filter here is aula_confirmacao.inicio::date
BETWEEN DATE_TRUNC('YEAR', CURRENT_TIMESTAMP) AND CURRENT_TIMESTAMP
(filtering all rows from the beginning of the year until now).
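
The index definition is roughly this (a sketch using the names from the plan;
the real definition may differ slightly):

    -- expression index so the inicio::date filter can use an index scan
    CREATE INDEX aula_confirmacao_inicio_date_idx
        ON aula_confirmacao ((inicio::date));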

> Besides, now() results in a timestamp, which in this query needs to be
> converted to date; it's perhaps better to use CURRENT_DATE there, although
> the benefits are probably immeasurable since it only needs to be calculated
> once for all rows it's compared against.
>

DATE_TRUNC expects a text and a timestamp.
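
That is, the two-argument form; although I believe a date argument also works,
since it is implicitly cast to timestamp:

    SELECT DATE_TRUNC('year', CURRENT_TIMESTAMP);  -- returns timestamp with time zone
    SELECT DATE_TRUNC('year', CURRENT_DATE);       -- also valid; date is cast to timestamp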

>
> I'm using PostgreSQL 9.2.3, with default_statistics_target at 1000.
>
> I can't remember what to do to make PostgreSQL see a better estimate in the
> scan of aula_confirmacao and the join with presenca. I got rusty after a
> long time of just doing modeling.
>
> Does anyone have an idea about that?
>
>
> Are you asking about vacuum? You're definitely not vacuuming enough; your
> statistics and your actual numbers of rows differ by enormous amounts
> (that's why depesz colours them red).
>

autovacuum is running on both production and the development database. This is
happening on the development database, right after a fresh restore.
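
If it does need to be more aggressive, I understand it can be tuned per table,
something like this (illustrative values only):

    -- per-table autovacuum settings (a sketch, not what we run)
    ALTER TABLE aula_confirmacao
        SET (autovacuum_vacuum_scale_factor = 0.05,
             autovacuum_analyze_scale_factor = 0.02);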

> Are you using autovacuum? If so, you probably need to tune it more
> aggressively. For the short term, running an ANALYSE on those tables should
> at least get you more accurate query plans.
>

I've done that, with default_statistics_target at 1000, 100, and 200 (I left
it at 200, which was the production config too).
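
Roughly the kind of thing I ran after each change (a sketch; table names from
the plan):

    SET default_statistics_target = 200;  -- session value picked up by ANALYZE
    ANALYZE aula_confirmacao;
    ANALYZE presenca;
    -- a per-column target is also possible, e.g.:
    ALTER TABLE aluno_mensal ALTER COLUMN id_medicao SET STATISTICS 1000;
    ANALYZE aluno_mensal;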

Thank you, and sorry about the broken English; it has been a long time since
I last wrote...

--
Daniel Cristian Cruz
クルズ クリスチアン ダニエル
