From: John Arbash Meinel <john(at)arbash-meinel(dot)com>
To: SpaceBallOne <space_ball_one(at)hotmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: poor performance of db?
Date: 2005-01-25 03:15:28
Message-ID: 41F5B9D0.2010803@arbash-meinel.com
Lists: pgsql-performance
SpaceBallOne wrote:
> Thanks for the reply John,
>
> There are approximately 800 rows total in our job table (which stays
> approximately the same because 'completed' jobs get moved to a
> 'job_archive' table). The other jobs not shown by the specific query
> could be on backorder status, temporary deleted status, etc.
>
> You are correct in assuming the _id and _no (stands for 'number')
> fields are unique - this was one of the first pages I built when I
> started learning postgres, so not knowing how to set up primary and
> foreign keys at the time, I did it that way ... it is normalised to a
> point (probably rather sloppy, but it's a juggling act between learning
> on the fly, what I'd like to have, and the time constraints of being
> the only I.T. guy in the company!)...
>
> I think I will definitely focus on converting my database and php
> pages to using proper primary keys in postgres - especially if they
> automatically index themselves. I didn't do a vacuum analyse on them
> so that may explain why they didn't seem to do much.
You can probably add them now if you don't want to do a lot of redesign:

ALTER TABLE job ADD PRIMARY KEY (id);

If the values are not unique, this will cause problems, but as they
should be unique, I think it will work.
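If you want to double-check first, a query along these lines will show
any duplicate values that would block the primary key (this assumes the
column really is named id, as above):

SELECT id, count(*)
FROM job
GROUP BY id
HAVING count(*) > 1;   -- any rows returned here are duplicates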
I'm not sure how much help indexes will be if you only have 800 rows,
and your queries use 300+ of them.
You might need to re-think the query/table design.
You might try doing nested queries, or explicit joins, rather than one
big query with a WHERE clause.
Meaning do stuff like:
SELECT ...
  FROM (job JOIN customer ON job.customer_no = customer.customer_no) AS jc
  JOIN location ON jc.location_no = location.location_no
...
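The nested-query variant would be a sketch something like this (the
status filter is purely hypothetical, standing in for whatever your big
WHERE clause actually restricts on):

SELECT ...
  FROM (SELECT * FROM job WHERE status = 'active') AS j   -- hypothetical filter
  JOIN customer ON j.customer_no = customer.customer_no
...

Either way, run EXPLAIN ANALYZE on both forms and compare the plans.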
I also see that the planner seems to mis-estimate the number of rows in
some cases. Like here:
> -> Hash (cost=14.53..14.53 rows=853 width=19) (actual
> time=9.79..9.79 rows=0 loops=1)
> -> Seq Scan on street (cost=0.00..14.53 rows=853
> width=19) (actual time=0.01..5.12 rows=853 loops=1)
> -> Hash (cost=9.91..9.91 rows=491 width=20) (actual
> time=5.73..5.73 rows=0 loops=1)
> -> Seq Scan on ubd (cost=0.00..9.91 rows=491 width=20)
> (actual time=0.02..2.98 rows=491
Where it thinks the hash will return all of the rows from the sequential
scan, when in reality it returns none.
I think problems with the planner fall into 3 categories:

1. You didn't VACUUM ANALYZE.
2. You did, but the planner doesn't keep sufficient statistics (ALTER
TABLE job ALTER COLUMN no SET STATISTICS <a number>).
3. Your join needs cross-column statistics, which postgres doesn't
support (yet).
If you only have 800 rows, I don't think you have to worry about
insufficient statistics, so that leaves 1 or 3. If you already did 1,
then I don't know what to tell you.
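For the record, fixing 1 (and 2, should your tables ever grow) looks
like this; the column name and statistics target here are just
examples:

VACUUM ANALYZE job;
ALTER TABLE job ALTER COLUMN customer_no SET STATISTICS 100;
ANALYZE job;   -- re-analyze so the new target takes effect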
John
=:->
PS> I'm not a guru at this stuff, so some of what I say may be wrong.
But hopefully it points you in the right direction.
>
> Thanks,
> Dave
> space_ball_one(at)hotmail(dot)com
>