From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Kevin Grittner <kgrittn(at)ymail(dot)com>, Simon Riggs <simon(at)2ndQuadrant(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: index-only scans with partial indexes |
Date: | 2015-09-13 21:21:30 |
Message-ID: | 55F5E8DA.8080303@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 09/13/2015 08:03 PM, Kevin Grittner wrote:
>
> In my view, the most disappointing thing about the patch is that
> when both indexes are present, it doesn't use the narrower one. If
> *only* the narrower index is present, it runs the index-only scan
> using that index for count(b) and count(*), which is faster. Can
> we wrangle this patch into making a better choice among available
> index-only scans?
That's indeed strange, but after poking into that for a while, it seems
rather like a costing issue. Let me demonstrate:
create table t (a int, b int);
insert into t select i,i from generate_series(1,1000000) s(i);
create index idx1 on t (a) where b between 300000 and 600000;
create index idx2 on t (a,b) where b between 300000 and 600000;
vacuum t;
analyze t;
explain select a from t where b between 300000 and 600000;
QUERY PLAN
---------------------------------------------------------------------------
Index Only Scan using idx2 on t (cost=0.42..9236.88 rows=297823 width=4)
Index Cond: ((b >= 300000) AND (b <= 600000))
(2 rows)
drop index idx2;
QUERY PLAN
---------------------------------------------------------------------------
Index Only Scan using idx1 on t (cost=0.42..9236.88 rows=297823 width=4)
(1 row)
Now, both plans are index only scans, but the first one has Index Cond
and the other one does not!
I've put a bunch of logging into cost_index(), and turns out that while
the final cost is *exactly* the same, it's most likely by chance. After
we call amcostestimate, we get these two results:
idx1: indexStartupCost=0.422500 indexTotalCost=4769.537500
indexSelectivity=0.297823 indexCorrelation=1.000000
idx2: indexStartupCost=0.422500 indexTotalCost=6258.652500
indexSelectivity=0.297823 indexCorrelation=0.750000
So amcostestimate does make a difference, and we get
idx1: run_cost = 4769.115000
idx2: run_cost = 6258.230000
and then for both indexes
tuples_fetched=297823.000000
loop_count=1.000000
pages_fetched = 0.000000
but then we do
run_cost += cpu_per_tuple * tuples_fetched;
and we end up with
run_cost = 9236.460000
for both indexes. How's that possible? Number of tuples fetched is
exactly the same for both indexes (297823), so clearly cpu_per_tuple
must be different. That however seems a bit strange, because the only
difference between the indexes is the additional column, and the
condition should be covered by the index predicate ...
It seems that the problem is related to this:
qpquals
= extract_nonindex_conditions(baserel->baserestrictinfo,
path->indexquals);
while the "larger" index on (a,b) gets
path->indexquals=(b BETWEEN 300000 AND 600000)
qpquals=NIL
the "smaller" index on (a) gets
path->indexquals=NIL
qpquals=(b BETWEEN 300000 AND 600000)
And so the larger index gets qpqual_cost=0, the smaller one gets
qpqual_cost=0.005, and so cpu_per_tuple is either 0.01 or 0.015.
Which is exactly the difference between costs from amcostestimate
idx1: 4769.115000 + 0.015 * 297823 = 9236.460000
idx2: 6258.230000 + 0.010 * 297823 = 9236.460000
Sppoky! Although it seems like a mere coincidence, thanks to the nice
round numbers of tuples in the table, and lucky choice of two conditions.
For example after replacing the BETWEEN condition (which is actually two
conditions) with a single one (b<300000) - both in the indexes and
query, I get this:
QUERY PLAN
---------------------------------------------------------------------------
Index Only Scan using idx2 on t (cost=0.42..8541.25 rows=299507 width=4)
Index Cond: (b < 300000)
(2 rows)
drop index idx2;
QUERY PLAN
---------------------------------------------------------------------------
Index Only Scan using idx1 on t (cost=0.42..8541.43 rows=299507 width=4)
(1 row)
The plans are not costed exactly the same anymore (I'm not saying the
costs are correct - clearly still the 'larger' index was preferred).
It's not bound to index only scan either - after adding another column
to the table, and requesting it from the query (so preventing IOS), I
get exactly the same issues.
I really wonder why we get different path->indexquals for those indexes,
because that's the root of the issue here. Any ideas?
>
> It also seems disappointing that we don't recognize that
> count(columnname) could be treated as a synonym for count(*) if
> columnname is NOT NULL, but that doesn't seem like material for
> this patch.
Yeah, that's really not what this patch deals with.
>
> Benchmarking took so much time I did not get to a close review of
> the code changes. :-( Based on just the test results, it appears
> to me that the patch as it stands would be a net win for the vast
> majority of workloads where it would make any noticeable difference,
> and I didn't manage to create any big downside. I would like to
> see whether we can't improve the choice of partial index when there
> are multiple possibilities -- it seems quite surprising to see
> identical estimates for indexes of different column counts and key
> widths, and to see the wider index chosen when the narrow one is
> clearly (and predictably) faster.
>
> I am changing this to Waiting on Author.
>
> I will be on vacation without Internet access for the next 15 days,
> so hopefully someone else can have a look when a new version is
> posted. If it's still open I'll have a look when I get back.
Thanks for the feedback!
regards
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Dean Rasheed | 2015-09-13 21:38:48 | Re: RLS open items are vague and unactionable |
Previous Message | Ildus Kurbangaliev | 2015-09-13 21:05:28 | Re: [PATCH] Refactoring of LWLock tranches |