From: Francisco Olarte <folarte(at)peoplecall(dot)com>
To: Daniel Begin <jfd553(at)hotmail(dot)com>
Cc: Bill Moran <wmoran(at)potentialtech(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Melvin Davidson <melvin6925(at)gmail(dot)com>
Subject: Re: Planner cost adjustments
Date: 2015-06-11 14:04:50
Message-ID: CA+bJJbzMgRCKcy7ZPpEYUv2ng6=a83GncirQ8gy=3_BrvgsZgw@mail.gmail.com
Lists: pgsql-general
Hi Daniel:
On Thu, Jun 11, 2015 at 2:38 PM, Daniel Begin <jfd553(at)hotmail(dot)com> wrote:
.....
> The remaining problem seems related to the statistics of some large tables.
> On one hand, I might increase the statistic target for these tables to 500,
> or even to 1000 and look at the results (but I have doubts it will help). On
> the other hand, I could simply force enable_seqscan to OFF for queries
> dealing with them.
>
> Still not sure about the best solution but the problem is definitely
> narrower :-)
One somewhat unrelated point. IIRC your problems were related to
queries that join a selected set of indexed IDs from a smaller table
against some very large tables. ( I'm not able to follow too well which
is which, users on changesets, I'm a bit lost. ) Given that your
runtimes are always high ( in the seconds range, so wire speed /
latencies are not much of an issue ) and that selectivity estimates on
huge tables are always problematic and may be thwarting your plans, you
may be able to get faster results by splitting your query.
If I read your plans correctly, that would mean selecting your 600
users in one query, then preparing the changeset query for a single
user_id, which should be indexed, and looping it for every user. Given
that current machines can easily send and receive 600 queries in a
second, this may lead to a simpler solution. It does mean you're using
the DB as a somewhat intelligent plain old indexed file, but sometimes
that is the simpler approach ( heck, some of my code uses algorithms
from the tape era because they were the fastest way I could do it ).
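A rough sketch of what I mean, assuming changesets has an indexed
user_id column ( names and columns are guesses from your mails, adjust
to your schema ):

    -- 1) Fetch the ~600 user ids with whatever query you already use for them.

    -- 2) Prepare the per-user lookup once per session...
    PREPARE changesets_for_user(bigint) AS
        SELECT c.id, c.created_at        -- only the columns you really need
          FROM changesets AS c
         WHERE c.user_id = $1;

    -- 3) ...and execute it from the app, once per user id:
    EXECUTE changesets_for_user(12345);

    -- When done:
    DEALLOCATE changesets_for_user;

With an index on changesets(user_id), each EXECUTE should stay indexed
and fast regardless of what the planner thinks of the big join.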
I needed to do this in one of my programs: the optimizer kept
selecting bad plans, so I did the first query, held the results in
memory, and then prepared and executed the second in a loop from the
app. My query was selecting a couple thousand values from a
sub-million-row table and joining them with a multi-million-row one,
getting a couple hundred matches per original value. Splitting it made
the queries on the big table always run indexed and fast ( and, as a
side bonus, it avoided duplicating parts of the first table's record on
the wire a hundred times, which was nice since the short table was wide
and I only needed 3 short fields from the second one; that made the
first query run at wire speed and the second at disk speed ).
Francisco Olarte.