Re: slow join not using index properly

From: Ilya Kosmodemiansky <ilya(dot)kosmodemiansky(at)postgresql-consulting(dot)com>
To: Stefan Amshey <sramshey(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: slow join not using index properly
Date: 2014-03-25 07:05:51
Message-ID: CAG95seWbz8y3aqUTwSA-ZNDdcr7vJEy=t6JiynqKODO0jsxSHA@mail.gmail.com
Lists: pgsql-performance

Hi Stefan,

stupid me - I've missed a couple of things:

with RECURSIVE qq(cont_key, anc_key) AS
(

SELECT
a1.context_key, ancestor_key
FROM
virtual_ancestors a1
UNION select ( -- here, in the union
SELECT
a1.context_key, a1.ancestor_key
FROM
virtual_ancestors a1
WHERE
a1.context_key > cont_key
ORDER BY
a1.context_key LIMIT 1
)
from qq where cont_key is not null -- and here
)
SELECT
distinct a.cont_key
FROM
qq a, collection_data, virtual_ancestors a2
WHERE
a.cont_key IS NOT NULL
AND a.anc_key = collection_data.context_key
AND collection_data.collection_context_key = a2.context_key
AND a2.ancestor_key = 1072173;

sorry for the confusion
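
For reference, the rewrite above is an attempt at the "loose index scan"
technique from the PostgreSQL wiki. A minimal sketch of the canonical
single-column form, assuming only the btree index virtual_context_key_idx
on context_key, looks like this:

WITH RECURSIVE t(context_key) AS
(
    (SELECT context_key FROM virtual_ancestors ORDER BY context_key LIMIT 1)
    UNION ALL
    SELECT (SELECT a1.context_key
            FROM virtual_ancestors a1
            WHERE a1.context_key > t.context_key
            ORDER BY a1.context_key LIMIT 1)
    FROM t
    WHERE t.context_key IS NOT NULL
)
SELECT context_key FROM t WHERE context_key IS NOT NULL;

Each recursive step does a single index probe to jump to the next distinct
context_key instead of scanning every row. Note that the first arm must be
parenthesized for its ORDER BY/LIMIT to be legal inside the UNION, and that
a scalar subquery can return only one column, so carrying ancestor_key along
as a second output column needs extra care.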

On Mon, Mar 24, 2014 at 7:40 PM, Stefan Amshey <sramshey(at)gmail(dot)com> wrote:
> Hi Ilya-
>
> Thanks so much for taking a stab at optimizing that query. I had to fiddle
> a bit with your proposed version in order to get it to function. Here's what
> I came up with in the end:
>>
>> with RECURSIVE qq(cont_key, anc_key) AS
>> (
>>
>> SELECT
>> a1.context_key, ancestor_key
>> FROM
>> virtual_ancestors a1
>> UNION (
>> SELECT
>> a1.context_key, a1.ancestor_key
>> FROM
>> virtual_ancestors a1, qq
>> WHERE
>> context_key > qq.cont_key
>> ORDER BY
>> context_key LIMIT 1
>> )
>> )
>> SELECT
>> distinct a.cont_key
>> FROM
>> qq a, collection_data, virtual_ancestors a2
>> WHERE
>> a.cont_key IS NOT NULL
>> AND a.anc_key = collection_data.context_key
>> AND collection_data.collection_context_key = a2.context_key
>> AND a2.ancestor_key = 1072173;
>
>
> I had to drop the MIN( a1.context_key ) and LIMIT 1 from the first select
> statement in your version in order to avoid syntax issues or other errors.
> The version above does produce the same counts as the original, but in the
> end it wasn't really a win for us, presumably because without the aggregate
> seed the first arm pulls in every (context_key, ancestor_key) row, so the
> recursive union still materializes all ~10.8 million of them. Here's the
> plan it produced:
>
>> HashAggregate (cost=707724.36..707726.36 rows=200 width=4) (actual
>> time=27638.844..27639.706 rows=3522 loops=1)
>> Buffers: shared hit=79323, temp read=49378 written=47557
>> CTE qq
>> -> Recursive Union (cost=0.00..398869.78 rows=10814203 width=8)
>> (actual time=0.018..20196.397 rows=10821685 loops=1)
>> Buffers: shared hit=74449, temp read=49378 written=23779
>> -> Seq Scan on virtual_ancestors a1 (cost=0.00..182584.93
>> rows=10814193 width=8) (actual time=0.010..2585.411 rows=10821685 loops=1)
>> Buffers: shared hit=74443
>> -> Limit (cost=0.00..0.08 rows=1 width=8) (actual
>> time=7973.297..7973.298 rows=1 loops=1)
>> Buffers: shared hit=6, temp read=49378 written=1
>> -> Nested Loop (cost=0.00..30881281719119.79
>> rows=389822567470830 width=8) (actual time=7973.296..7973.296 rows=1
>> loops=1)
>> Join Filter: (a1.context_key > qq.cont_key)
>> Rows Removed by Join Filter: 22470607
>> Buffers: shared hit=6, temp read=49378 written=1
>> -> Index Scan using virtual_context_key_idx on
>> virtual_ancestors a1 (cost=0.00..18206859.46 rows=10814193 width=8) (actual
>> time=0.018..0.036 rows=3 loops=1)
>> Buffers: shared hit=6
>> -> WorkTable Scan on qq (cost=0.00..2162838.60
>> rows=108141930 width=4) (actual time=0.008..1375.445 rows=7490203 loops=3)
>> Buffers: temp read=49378 written=1
>> -> Hash Join (cost=25283.37..308847.31 rows=2905 width=4) (actual
>> time=449.167..27629.759 rows=13152 loops=1)
>> Hash Cond: (a.anc_key = collection_data.context_key)
>> Buffers: shared hit=79323, temp read=49378 written=47557
>> -> CTE Scan on qq a (cost=0.00..216284.06 rows=10760132
>> width=8) (actual time=0.021..25265.179 rows=10821685 loops=1)
>> Filter: (cont_key IS NOT NULL)
>> Buffers: shared hit=74449, temp read=49378 written=47557
>> -> Hash (cost=25282.14..25282.14 rows=98 width=4) (actual
>> time=373.836..373.836 rows=2109 loops=1)
>> Buckets: 1024 Batches: 1 Memory Usage: 75kB
>> Buffers: shared hit=4874
>> -> Hash Join (cost=17557.15..25282.14 rows=98 width=4)
>> (actual time=368.374..373.013 rows=2109 loops=1)
>> Hash Cond: (a2.context_key =
>> collection_data.collection_context_key)
>> Buffers: shared hit=4874
>> -> Index Only Scan using virtual_ancestors_pkey on
>> virtual_ancestors a2 (cost=0.00..238.57 rows=272 width=4) (actual
>> time=0.029..1.989 rows=1976 loops=1)
>> Index Cond: (ancestor_key = 1072173)
>> Heap Fetches: 917
>> Buffers: shared hit=883
>> -> Hash (cost=10020.40..10020.40 rows=602940
>> width=8) (actual time=368.057..368.057 rows=603066 loops=1)
>> Buckets: 65536 Batches: 1 Memory Usage:
>> 23558kB
>> Buffers: shared hit=3991
>> -> Seq Scan on collection_data
>> (cost=0.00..10020.40 rows=602940 width=8) (actual time=0.006..146.447
>> rows=603066 loops=1)
>> Buffers: shared hit=3991
>> Total runtime: 27854.200 ms
>
>
> I also tried including the MIN( a1.context_key ) in the first select
> statement as you had written it, but upon doing that it became necessary to
> add a GROUP BY clause, and doing that changed the final number of rows
> selected:
>>
>> ERROR: column "a1.ancestor_key" must appear in the GROUP BY clause or be
>> used in an aggregate function
>> LINE 4: min( a1.context_key ), ancestor_key
>>                                ^
>
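> Presumably that's because the GROUP BY collapses the recursion seed down to
> one row per ancestor_key (the minimum context_key for each) instead of one
> row per (context_key, ancestor_key) pair, i.e. roughly:
>
> SELECT min(a1.context_key), a1.ancestor_key
> FROM virtual_ancestors a1
> GROUP BY a1.ancestor_key; -- one seed row per ancestor_key
>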
> Including the LIMIT 1 at the end of the first select statement gave a syntax
> error that I couldn't seem to get past, so I think it might be invalid:
>>
>> ERROR: syntax error at or near "UNION"
>> LINE 8: UNION (
>> ^
>
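> For what it's worth, it looks like the parser only accepts an ORDER BY or
> LIMIT on an individual UNION arm when that arm is wrapped in its own
> parentheses; without them the clauses are taken to apply to the UNION as a
> whole. A minimal (hypothetical) sketch of the legal form:
>
> ( SELECT context_key FROM virtual_ancestors ORDER BY context_key LIMIT 1 )
> UNION
> ( SELECT context_key FROM virtual_ancestors ORDER BY context_key DESC LIMIT 1 );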
>
> So I landed on the version that I posted above, which seems to select the
> same set in all of the cases that I tried.
>
> Anyway, thanks again for taking a stab at helping, I do appreciate it. If
> you have any other ideas that might be of help I'd certainly be happy to
> hear them.
>
> Take care,
> /Stefan
>
>
>
>
> On Thu, Mar 20, 2014 at 11:02 PM, Ilya Kosmodemiansky
> <ilya(dot)kosmodemiansky(at)postgresql-consulting(dot)com> wrote:
>>
>> Hi Stefan!
>>
>> Probably you need to rewrite your query like this (check it first):
>>
>> with RECURSIVE qq(cont_key, anc_key) as
>> (
>> select min(a1.context_key), ancestor_key from virtual_ancestors a1
>> union select
>> (SELECT
>> a1.context_key, ancestor_key
>> FROM
>> virtual_ancestors a1 where context_key > cont_key order by
>> context_key limit 1) from qq where cont_key is not null
>> )
>> select a1.cont_key
>> from qq a1, collection_data, virtual_ancestors a2
>> WHERE
>> a1.anc_key = collection_data.context_key
>> AND collection_data.collection_context_key = a2.context_key
>> AND a2.ancestor_key = ?
>>
>> best regards,
>> Ilya
>>
>> On Fri, Mar 21, 2014 at 12:56 AM, Stefan Amshey <sramshey(at)gmail(dot)com>
>> wrote:
>> > We have a slow performing query that we are trying to improve, and it
>> > appears to be performing a sequential scan at a point where it should be
>> > utilizing an index. Can anyone tell me why postgres is opting to do it
>> > this
>> > way?
>> >
>> > The original query is as follows:
>> >
>> > SELECT DISTINCT
>> > a1.context_key
>> > FROM
>> > virtual_ancestors a1, collection_data, virtual_ancestors a2
>> > WHERE
>> > a1.ancestor_key = collection_data.context_key
>> > AND collection_data.collection_context_key = a2.context_key
>> > AND a2.ancestor_key = ?
>> >
>> > The key relationships should all be using indexed columns, but the query plan
>> > that postgres comes up with ends up performing a sequential scan on the
>> > collection_data table (in this case about 602k rows) where we would have
>> > expected it to utilize the index:
>> >
>> > HashAggregate (cost=60905.73..60935.73 rows=3000 width=4) (actual
>> > time=3366.165..3367.354 rows=3492 loops=1)
>> > Buffers: shared hit=16291 read=1222
>> > -> Nested Loop (cost=17546.26..60898.23 rows=3000 width=4) (actual
>> > time=438.332..3357.918 rows=13037 loops=1)
>> > Buffers: shared hit=16291 read=1222
>> > -> Hash Join (cost=17546.26..25100.94 rows=98 width=4)
>> > (actual
>> > time=408.554..415.767 rows=2092 loops=1)
>> > Hash Cond: (a2.context_key =
>> > collection_data.collection_context_key)
>> > Buffers: shared hit=4850 read=3
>> > -> Index Only Scan using virtual_ancestors_pkey on
>> > virtual_ancestors a2 (cost=0.00..233.32 rows=270 width=4) (actual
>> > time=8.532..10.703 rows=1960 loops=1)
>> > Index Cond: (ancestor_key = 1072173)
>> > Heap Fetches: 896
>> > Buffers: shared hit=859 read=3
>> > -> Hash (cost=10015.56..10015.56 rows=602456 width=8)
>> > (actual time=399.708..399.708 rows=602570 loops=1)
>> > Buckets: 65536 Batches: 1 Memory Usage: 23538kB
>> > Buffers: shared hit=3991
>> > ######## sequential scan occurs here ##########
>> > -> Seq Scan on collection_data
>> > (cost=0.00..10015.56
>> > rows=602456 width=8) (actual time=0.013..163.509 rows=602570 loops=1)
>> > Buffers: shared hit=3991
>> > -> Index Only Scan using virtual_ancestors_pkey on
>> > virtual_ancestors a1 (cost=0.00..360.70 rows=458 width=8) (actual
>> > time=1.339..1.403 rows=6 loops=2092)
>> > Index Cond: (ancestor_key = collection_data.context_key)
>> > Heap Fetches: 7067
>> > Buffers: shared hit=11441 read=1219
>> > Total runtime: 3373.058 ms
>> >
>> >
>> > The table definitions are as follows:
>> >
>> > Table "public.virtual_ancestors"
>> > Column | Type | Modifiers
>> > --------------+----------+-----------
>> > ancestor_key | integer | not null
>> > context_key | integer | not null
>> > degree | smallint | not null
>> > Indexes:
>> > "virtual_ancestors_pkey" PRIMARY KEY, btree (ancestor_key,
>> > context_key)
>> > "virtual_context_key_idx" btree (context_key)
>> > Foreign-key constraints:
>> > "virtual_ancestors_ancestor_key_fkey" FOREIGN KEY (ancestor_key)
>> > REFERENCES contexts(context_key)
>> > "virtual_ancestors_context_key_fkey" FOREIGN KEY (context_key)
>> > REFERENCES contexts(context_key)
>> >
>> > Table "public.collection_data"
>> > Column | Type | Modifiers
>> > ------------------------+----------------------+-----------
>> > collection_context_key | integer | not null
>> > context_key | integer | not null
>> > type | character varying(1) | not null
>> > source | character varying(1) | not null
>> > Indexes:
>> > "collection_data_context_key_idx" btree (context_key)
>> > "collection_data_context_key_index" btree (collection_context_key)
>> > CLUSTER
>> > Foreign-key constraints:
>> > "collection_data_collection_context_key_fkey" FOREIGN KEY
>> > (collection_context_key) REFERENCES contexts(context_key) ON DELETE
>> > CASCADE
>> > "collection_data_context_key_fkey" FOREIGN KEY (context_key)
>> > REFERENCES
>> > contexts(context_key) ON DELETE CASCADE
>> >
>> > Can anyone suggest a way that we can get postgres to use the
>> > collection_data_context_key_index properly? I thought that it might be
>> > related to the fact that collection_data_context_key_index is a CLUSTERED
>> > index, but we did some basic experimentation that seems to indicate
>> > otherwise, i.e. the bad plan persists despite re-clustering the index.
>> >
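>> > A standard diagnostic here (not a fix) is to forbid sequential scans for
>> > a single session and see whether the planner is able to produce the index
>> > plan at all, e.g.:
>> >
>> > BEGIN;
>> > SET LOCAL enable_seqscan = off;
>> > EXPLAIN (ANALYZE, BUFFERS)
>> > SELECT DISTINCT a1.context_key
>> > FROM virtual_ancestors a1, collection_data, virtual_ancestors a2
>> > WHERE a1.ancestor_key = collection_data.context_key
>> > AND collection_data.collection_context_key = a2.context_key
>> > AND a2.ancestor_key = 1072173;
>> > ROLLBACK;
>> >
>> > If the index plan shows up and runs faster, the sequential scan is a
>> > costing decision rather than a missing-index problem.
>> >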
>> > We are using PostgreSQL 9.2.5 on x86_64-unknown-linux-gnu, compiled by
>> > gcc (Debian 4.4.5-8) 4.4.5, 64-bit
>> >
>> > Interestingly, on an instance running PostgreSQL 9.2.4 on
>> > x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.4.5-8) 4.4.5, 64-bit
>> > where I copied the 2 tables over to a temporary database, the plan comes
>> > out
>> > differently:
>> >
>> > HashAggregate (cost=39692.03..39739.98 rows=4795 width=4) (actual
>> > time=73.285..75.141 rows=3486 loops=1)
>> > Buffers: shared hit=22458
>> > -> Nested Loop (cost=0.00..39680.05 rows=4795 width=4) (actual
>> > time=0.077..63.116 rows=13007 loops=1)
>> > Buffers: shared hit=22458
>> > -> Nested Loop (cost=0.00..32823.38 rows=164 width=4) (actual
>> > time=0.056..17.685 rows=2084 loops=1)
>> > Buffers: shared hit=7529
>> > -> Index Only Scan using virtual_ancestors_pkey on
>> > virtual_ancestors a2 (cost=0.00..1220.85 rows=396 width=4) (actual
>> > time=0.025..2.732 rows=1954 loops=1)
>> > Index Cond: (ancestor_key = 1072173)
>> > Heap Fetches: 1954
>> > Buffers: shared hit=1397
>> > ######## Note the index scan here - this is what it SHOULD be doing
>> > ##############
>> > -> Index Scan using collection_data_context_key_index on
>> > collection_data (cost=0.00..79.24 rows=56 width=8) (actual
>> > time=0.004..0.005 rows=1 loops=1954)
>> > Index Cond: (collection_context_key =
>> > a2.context_key)
>> > Buffers: shared hit=6132
>> > -> Index Only Scan using virtual_ancestors_pkey on
>> > virtual_ancestors a1 (cost=0.00..35.40 rows=641 width=8) (actual
>> > time=0.007..0.015 rows=6 loops=2084)
>> > Index Cond: (ancestor_key = collection_data.context_key)
>> > Heap Fetches: 13007
>> > Buffers: shared hit=14929
>> > Total runtime: 76.431 ms
>> >
>> > Why can't I get the Postgres 9.2.5 instance to use the optimal plan?
>> >
>> > Thanks in advance!
>> > /Stefan
>> >
>> > --
>> > -
>> > Stefan Amshey
>>
>>
>>
>> --
>> Ilya Kosmodemiansky,
>>
>> PostgreSQL-Consulting.com
>> tel. +14084142500
>> cell. +4915144336040
>> ik(at)postgresql-consulting(dot)com
>
>
>
>
> --
> -
> Stefan Amshey

--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik(at)postgresql-consulting(dot)com
