From: | Robert(dot)Farrugia(at)go(dot)com(dot)mt |
---|---|
To: | pgsql-admin(at)postgresql(dot)org |
Subject: | After VACUUM, statistics become skewed |
Date: | 2003-05-21 15:00:21 |
Message-ID: | 20030521150018.0CBB1924EC2@developer.postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
I have noticed the following problem on various large tables for certain
queries.
Given a table around 4GB in size containing millions of records, before
vacuuming the following query used the correct index
(mo_200302_called_idx) which is built on answertime and callednumber_type.
Query is as follows:
select answertime::date, count(*), sum(callduration) from
mobileorig_200302 where answertime between '2003/2/3 00:00:00' and
'2003/2/14 23:59:59' and normal(dialleddigits_value) = '50043992' and
callednumber_type in ('P', 'M') group by answertime::date
After doing a database wide vacuum full analyze, the above query is
starting to use an incorrect index i.e. the callingnumber_type (which uses
answertime and callingnumber_type fields). This eventually slows down the
system since the query takes more time to finish.
NOTICE: QUERY PLAN:
Aggregate (cost=9218923.08..9218935.56 rows=166 width=12)
-> Group (cost=9218923.08..9218927.24 rows=1664 width=12)
-> Sort (cost=9218923.08..9218923.08 rows=1664 width=12)
-> Index Scan using mo_200302_calling_idx on
mobileorig_200302 (cost=0.00..9218834.06 rows=1664 width=12)
EXPLAIN
I managed to track down the problem to this. If I used only one
callednumber_type, i.e. the query becomes
select answertime::date, count(*), sum(callduration) from
mobileorig_200302 where answertime between '2003/2/3 00:00:00' and
'2003/2/14 23:59:59' and normal(dialleddigits_value) = '50043992' and callednumber_type in ('M') group by answertime::date
the query planner uses the correct index, while using more than one type,
it skews up.
NOTICE: QUERY PLAN:
Aggregate (cost=137870.25..137871.57 rows=18 width=12)
-> Group (cost=137870.25..137870.69 rows=177 width=12)
-> Sort (cost=137870.25..137870.25 rows=177 width=12)
-> Index Scan using mo_200302_called_idx on
mobileorig_200302 (cost=0.00..137863.66 rows=177 width=12)
EXPLAIN
Anyone else encountered something similar ? Any ideas on what is happening
and if is solvable ? Usually by dropping the table and reloading it from
disk may solve the problem (it may take more than one try), but this is
becoming unpractical due to the amount of data to restore each time. I'll
try re-indexing the tables and see if something happens. Unfortunately
this also happens on tables which have not been changed for a very long
time and have also been re-indexed so I'm not confident on this.
The system runs on postgres 7.2.3.
Regards
Robert
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2003-05-21 16:51:06 | Re: After VACUUM, statistics become skewed |
Previous Message | Tom Lane | 2003-05-21 14:25:42 | Re: Logfile removal |