From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Strange, John W" <john(dot)w(dot)strange(at)jpmorgan(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Trying to understand Stats/Query planner issue
Date: 2011-11-13 16:17:22
Message-ID: 6469.1321201042@sss.pgh.pa.us
Lists: pgsql-performance
"Strange, John W" <john(dot)w(dot)strange(at)jpmorgan(dot)com> writes:
> I have a question on how the analyzer works in this type of scenario.
> We calculate some results and COPY INTO some partitioned tables, then use some SELECTs to aggregate the data back out into reports. Every once in a while the aggregation step picks a bad plan due to stale stats on the tables that were just populated. Updating the stats and rerunning the query seems to solve the problem; this only happens if we enable nested loop query plans.
Well, even if auto-analyze launches instantly after you commit the
insertions (which it won't), it's going to take time to scan the table
and then commit the updates to pg_statistic. So there is always going
to be some window where queries will get planned with obsolete
information. If you're inserting enough data to materially change the
statistics of a table, and you need to query that table right away,
doing a manual ANALYZE rather than waiting for auto-analyze is
recommended.
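A minimal sketch of that workflow (the table name, file path, and query are hypothetical placeholders, not from the original report):

```sql
-- Bulk-load the freshly calculated results into one partition
COPY results_part FROM '/tmp/results.csv' WITH (FORMAT csv);

-- Refresh pg_statistic for this table right away,
-- instead of waiting for auto-analyze to get around to it
ANALYZE results_part;

-- The aggregation query is now planned against current statistics,
-- so the planner is far less likely to pick a bad nested-loop plan
SELECT report_key, sum(value)
FROM results_part
GROUP BY report_key;
```

The point is simply ordering: the explicit ANALYZE between the load and the first query closes the window in which plans are built from obsolete statistics.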
> The other option is just to analyze each table involved in the query after the insert, but that seems a bit counterproductive.
Why would you think that? This type of scenario is exactly why ANALYZE
isn't deprecated as a user command.
regards, tom lane