Re: Planner cost adjustments

From: Francisco Olarte <folarte(at)peoplecall(dot)com>
To: Daniel Begin <jfd553(at)hotmail(dot)com>
Cc: Bill Moran <wmoran(at)potentialtech(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Melvin Davidson <melvin6925(at)gmail(dot)com>
Subject: Re: Planner cost adjustments
Date: 2015-06-11 14:04:50
Message-ID: CA+bJJbzMgRCKcy7ZPpEYUv2ng6=a83GncirQ8gy=3_BrvgsZgw@mail.gmail.com
Lists: pgsql-general

Hi Daniel:

On Thu, Jun 11, 2015 at 2:38 PM, Daniel Begin <jfd553(at)hotmail(dot)com> wrote:
.....
> The remaining problem seems related to the statistics of some large tables.
> On one hand, I might increase the statistic target for these tables to 500,
> or even to 1000 and look at the results (but I have doubts it will help). On
> the other hand, I could simply force enable_seqscan to OFF for queries
> dealing with them.
>
> Still not sure about the best solution but the problem is definitely
> narrower :-)

One somewhat unrelated point. IIRC, your problems were related to
queries that join a selected set of indexed IDs from a smaller table
against some very large tables (I'm not able to follow too well which
is which, users vs. changesets, I'm a bit lost). Given that your
runtimes are always high (in the seconds range, so wire speed /
latencies are not much of an issue) and that selectivity estimates on
huge tables are always problematic and may be thwarting your plans,
you may be able to get faster results by splitting your query.

If I read your plans correctly, that would mean selecting your 600
users in one query, then preparing the changeset query for a single
user_id (which should be indexed) and looping it for every user; a
sketch follows below. Given that current machines can easily send and
receive 600 queries per second, this may lead to a simpler solution.
It means you're using the DB as a somewhat intelligent plain old
indexed file, but sometimes that is the simpler approach (heck, some
of my code uses algorithms from the tape era because they were the
fastest way I could do it).
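
Something like this minimal sketch, in Python with psycopg2; the
connection string and the table and column names (users, changesets,
user_id) are just guesses from the thread, so adjust to your schema:

    import psycopg2

    conn = psycopg2.connect("dbname=mydb")  # hypothetical DSN
    cur = conn.cursor()

    # First query: fetch the ~600 user ids in one shot
    # (replace with whatever actually selects your users).
    cur.execute("SELECT user_id FROM users LIMIT 600")
    user_ids = [row[0] for row in cur.fetchall()]

    # Then loop from the app: one indexed lookup on the big table
    # per user. There is no selectivity guessing left for the
    # planner; an index scan on user_id is the only sensible plan.
    results = []
    for uid in user_ids:
        cur.execute("SELECT * FROM changesets WHERE user_id = %s",
                    (uid,))
        results.extend(cur.fetchall())

    cur.close()
    conn.close()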

I needed to do this in one of my programs: the optimizer kept
selecting bad plans, so I ran the first query, held the results in
memory, and then prepared and executed the second query in a loop
from the app. My query selected a couple thousand values from a
sub-million-row table and joined them against a multimillion-row one,
getting a couple hundred matches per original value. Splitting it
made the queries on the big table always run indexed and fast (and,
as a side bonus, avoided sending the first record's columns over the
wire a hundred times each, which was nice since the short table was
wide and I only needed 3 short fields from the second one; that made
the first query run at wire speed and the second at disk speed).
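
Continuing the sketch above, that pattern with an explicit
server-side prepared statement looks roughly like this (again, the
table and column names are made up); the big-table query is planned
once, and only the few short fields travel on the wire:

    # Plan the big-table lookup once, server side; only the three
    # short fields come back, not the wide first-table columns.
    cur.execute("PREPARE by_key (bigint) AS "
                "SELECT f1, f2, f3 FROM big_table WHERE key_id = $1")

    # rows_in_memory stands for the first query's results, held in RAM.
    for key, wide_row in rows_in_memory:
        cur.execute("EXECUTE by_key (%s)", (key,))
        matches = cur.fetchall()  # a couple hundred per key
        # ... combine wide_row with matches in the app ...

    cur.execute("DEALLOCATE by_key")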

Francisco Olarte.
