From: | "Jason Coene" <jcoene(at)gotfrag(dot)com> |
---|---|
To: | "'Mr Pink'" <mr_pink_is_the_only_pro(at)yahoo(dot)com>, "'Scott Kirkwood'" <scottakirkwood(at)gmail(dot)com> |
Cc: | <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Caching of Queries |
Date: | 2004-09-23 16:53:25 |
Message-ID: | 200409231653.i8NGrUaX017580@ms-smtp-02.nyroc.rr.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
I'm not an expert, but I've been hunting down a killer performance problem
for a while now. It seems this may be the cause.
At peak load, our database slows to a trickle. The CPU and disk utilization
are normal - 20-30% used CPU and disk performance good.
All of our "postgres" processes end up in the "semwai" state - seemingly
waiting on other queries to complete. If the system isn't taxed in CPU or
disk, I have a good feeling that this may be the cause. I didn't know that
planning queries could create such a gridlock, but based on Mr Pink's
explanation, it sounds like a very real possibility.
We're running on SELECT's, and the number of locks on our "high traffic"
tables grows to the hundreds. If it's not the SELECT locking (and we don't
get that many INSERT/UPDATE on these tables), could the planner be doing it?
At peak load (~ 1000 queries/sec on highest traffic table, all very
similar), the serialized queries pile up and essentially create a DoS on our
service - requiring a restart of the PG daemon. Upon stop & start, it's
back to normal.
I've looked at PREPARE, but apparently it only lasts per-session - that's
worthless in our case (web based service, one connection per data-requiring
connection).
Does this sound plausible? Is there an alternative way to do this that I
don't know about? Additionally, in our case, I personally don't see any
downside to caching and using the same query plan when the only thing
substituted are variables. In fact, I'd imagine it would help performance
significantly in high-volume web applications.
Thanks,
Jason
> -----Original Message-----
> From: pgsql-performance-owner(at)postgresql(dot)org [mailto:pgsql-performance-
> owner(at)postgresql(dot)org] On Behalf Of Mr Pink
> Sent: Thursday, September 23, 2004 11:29 AM
> To: Scott Kirkwood; pgsql-performance(at)postgresql(dot)org
> Subject: Re: [PERFORM] Caching of Queries
>
> Not knowing anything about the internals of pg, I don't know how this
> relates, but in theory,
> query plan caching is not just about saving time re-planning queries, it's
> about scalability.
> Optimizing queries requires shared locks on the database metadata, which,
> as I understand it
> causes contention and serialization, which kills scalability.
>
> I read this thread from last to first, and I'm not sure if I missed
> something, but if pg isnt
> caching plans, then I would say plan caching should be a top priority for
> future enhancements. It
> needn't be complex either: if the SQL string is the same, and none of the
> tables involved in the
> query have changed (in structure), then re-use the cached plan. Basically,
> DDL and updated
> statistics would have to invalidate plans for affected tables.
>
> Preferably, it should work equally for prepared statements and those not
> pre-prepared. If you're
> not using prepare (and bind variables) though, your plan caching down the
> drain anyway...
>
> I don't think that re-optimizing based on values of bind variables is
> needed. It seems like it
> could actually be counter-productive and difficult to asses it's impact.
>
> That's the way I see it anyway.
>
> :)
>
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2004-09-23 17:05:32 | Re: Caching of Queries |
Previous Message | Patrick Hatcher | 2004-09-23 16:32:50 | Re: vacuum full & max_fsm_pages question |