From: | Jeff Amiel <becauseimjeff(at)yahoo(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | planner, newly added records and most common values |
Date: | 2012-01-20 01:36:26 |
Message-ID: | 1327023386.54510.YahooMailClassic@web65509.mail.ac4.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Ive got a scenario where I've got a 2 million row table. Data from inbound files gets processed into it.
A new file might have 10 resulting rows in this table...might have 40K...depends on the source, day of month, etc.
I've got a process that parses the file and loads the records into the table...giving it a unique file_id for the overall load and places that value on each record.
Another process will perform a series of queries...joining against that table (for only records with that file_id).
The problem is that the planner has no idea how many records might exist for that file_id.
If I throw a file_id at the planner that is not in the most common value list, it picks a nice number like 384 as it's row count estimate.
So when I am referencing a new file_id (that obviously isn't IN the most common value list as yet..regardless of how many
records I just loaded because I haven't run analyze yet), the planner dutifully estimates that I will get only 384 rows.
For large files, this is off by 2 (or god forbid, 3) orders of magnitude.
That yields very bad overall plans (regardless of the fact that I have indexes on the file_id column)
It seems like I am in a no-win situation. The query I am executing is fairly complex...and when the planner is off by multiple orders of magnitude on a rowcount, it goes way off the tracks in terms of planning.
I COULD do an analyze after loading the file...but there is no guarantee that the file I just loaded will end up in the most common value list....and I end up with bad plan.
Any thoughts?
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Sullivan | 2012-01-20 02:40:19 | Re: planner, newly added records and most common values |
Previous Message | David Johnston | 2012-01-19 22:56:00 | Re: How to improve my slow query for table have list of child table? |