Re: Unexpected expensive index scan

From: Jake Nielsen <jake(dot)k(dot)nielsen(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Unexpected expensive index scan
Date: 2016-09-28 02:45:54
Message-ID: CAP3LSG5ejP9Ezm_wOcVV5C=Fs5hHHctF6xNGS1=fqLjEjU7yYA@mail.gmail.com
Lists: pgsql-performance

On Tue, Sep 27, 2016 at 6:24 PM, Jake Nielsen <jake(dot)k(dot)nielsen(at)gmail(dot)com>
wrote:

>
>
> On Tue, Sep 27, 2016 at 6:03 PM, Jake Nielsen <jake(dot)k(dot)nielsen(at)gmail(dot)com>
> wrote:
>
>>
>> On Tue, Sep 27, 2016 at 5:41 PM, Mike Sofen <msofen(at)runbox(dot)com> wrote:
>>
>>> *From:* Jake Nielsen *Sent:* Tuesday, September 27, 2016 5:22 PM
>>>
>>>
>>> the query
>>>
>>> SELECT * FROM SyncerEvent WHERE ID > 12468 AND propogatorId NOT IN
>>> ('"d8130ab9!-66d0!-4f13!-acec!-a9556362f0ad"') AND conflicted != 1 AND
>>> userId = '57dc984f1c87461c0967e228' ORDER BY ID LIMIT 4000;
>>>
>>>
>>>
>>> On Tue, Sep 27, 2016 at 5:02 PM, Jake Nielsen <jake(dot)k(dot)nielsen(at)gmail(dot)com>
>>> wrote:
>>>
>>> I've got a query that takes a surprisingly long time to run, and I'm
>>> having a really rough time trying to figure it out.
>>>
>>>
>>>
>>> Before I get started, here are the specifics of the situation:
>>>
>>>
>>>
>>> Here is the table that I'm working with (apologies for spammy indices,
>>> I've been throwing shit at the wall)
>>>
>>>                             Table "public.syncerevent"
>>>     Column    |  Type   |                         Modifiers
>>> --------------+---------+-----------------------------------------------------------
>>>  id           | bigint  | not null default nextval('syncerevent_id_seq'::regclass)
>>>  userid       | text    |
>>>  event        | text    |
>>>  eventid      | text    |
>>>  originatorid | text    |
>>>  propogatorid | text    |
>>>  kwargs       | text    |
>>>  conflicted   | integer |
>>> Indexes:
>>>     "syncerevent_pkey" PRIMARY KEY, btree (id)
>>>     "syncereventidindex" UNIQUE, btree (eventid)
>>>     "anothersyncereventidindex" btree (userid)
>>>     "anothersyncereventidindexwithascending" btree (userid, id)
>>>     "asdfasdgasdf" btree (userid, id DESC)
>>>     "syncereventuseridhashindex" hash (userid)
>>>
>>>
>>>
>>> To provide some context, as per the wiki,
>>>
>>> there are 3,290,600 rows in this table.
>>>
>>> It gets added to frequently, but never deleted from.
>>>
>>> The "kwargs" column often contains mid-size JSON strings (roughly 30K
>>> characters on average)
>>>
>>> As of right now, the table has 53 users in it. About 20% of those have a
>>> negligible number of events, but the rest of the users have a fairly even
>>> smattering.
>>>
>>>
>>>
>>> EXPLAIN (ANALYZE, BUFFERS) says:
>>>
>>>
>>>                                                                      QUERY PLAN
>>> --------------------------------------------------------------------------------------------------------------------------------------
>>>  Limit  (cost=0.43..1218.57 rows=4000 width=615) (actual time=3352.390..3403.572 rows=4000 loops=1)
>>>    Buffers: shared hit=120244 read=160198
>>>    ->  Index Scan using syncerevent_pkey on syncerevent  (cost=0.43..388147.29 rows=1274560 width=615) (actual time=3352.386..3383.100 rows=4000 loops=1)
>>>          Index Cond: (id > 12468)
>>>          Filter: ((propogatorid <> '"d8130ab9!-66d0!-4f13!-acec!-a9556362f0ad"'::text) AND (conflicted <> 1) AND (userid = '57dc984f1c87461c0967e228'::text))
>>>          Rows Removed by Filter: 1685801
>>>          Buffers: shared hit=120244 read=160198
>>>  Planning time: 0.833 ms
>>>  Execution time: 3407.633 ms
>>> (9 rows)
>>>
>>> If it matters/interests you, here is my underlying confusion:
>>>
>>> From some internet sleuthing, I've decided that having a table per user
>>> (which would totally make this problem a non-issue) isn't a great idea.
>>> Because there is a file per table, having a table per user would not scale.
>>> My next thought was partial indexes (which would also totally help), but
>>> since there is also a file per index, this really doesn't side-step the
>>> problem. My rough mental model says: If there exists a way that a
>>> table-per-user scheme would make this more efficient, then there should
>>> also exist an index that could achieve the same effect (or close enough to
>>> not matter). I would think that "userid = '57dc984f1c87461c0967e228'" could
>>> utilize at least one of the two indexes on the userId column, but clearly
>>> I'm not understanding something.
>>>
>>> Any help in making this query more efficient would be greatly
>>> appreciated, and any conceptual insights would be extra awesome.
>>>
>>> Thanks for reading.
>>>
>>> -Jake
>>>
>>> ----------------------
>>>
>>>
>>>
>>> This stands out: WHERE ID > 12468 AND propogatorId NOT IN
>>> ('"d8130ab9!-66d0!-4f13!-acec!-a9556362f0ad"')
>>>
>>> As does this from the analyze: Rows Removed by Filter: 1685801
>>>
>>>
>>>
>>> The propogatorid is practically the only column NOT indexed, and it's
>>> used in a "not in". It looks like it's having to scan all the rows above
>>> the id cutoff to see whether they meet the filter requirement. "not in"
>>> can be very expensive. An index might help on this column. Have you
>>> tried that?
>>>
>>>
>>>
>>> Your rowcounts aren’t high enough to require partitioning or any other
>>> changes to your table that I can see right now.
>>>
>>>
>>>
>>> Mike Sofen (Synthetic Genomics)
>>>
>>>
>>>
>>
>> Thanks Mike, that's true, I hadn't thought of non-indexed columns forcing
>> a scan. Unfortunately, just to test this out, I tried pulling out the more
>> suspect parts of the query, and it still seems to want to do an index scan:
>>
>>
>> EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM SyncerEvent WHERE userId =
>> '57dc984f1c87461c0967e228' ORDER BY ID LIMIT 4000;
>>
>>
>>                                                                QUERY PLAN
>> ----------------------------------------------------------------------------------------------------------------------------------
>>  Limit  (cost=0.43..1140.62 rows=4000 width=615) (actual time=2706.365..2732.308 rows=4000 loops=1)
>>    Buffers: shared hit=120239 read=161924
>>    ->  Index Scan using syncerevent_pkey on syncerevent  (cost=0.43..364982.77 rows=1280431 width=615) (actual time=2706.360..2715.514 rows=4000 loops=1)
>>          Filter: (userid = '57dc984f1c87461c0967e228'::text)
>>          Rows Removed by Filter: 1698269
>>          Buffers: shared hit=120239 read=161924
>>  Planning time: 0.131 ms
>>  Execution time: 2748.526 ms
>> (8 rows)
>>
>> It definitely looks to me like it's starting at the ID = 12468 row, and
>> just grinding up the rows. The filter is (unsurprisingly) false for most of
>> the rows, so it ends up having to chew up half the table before it actually
>> finds 4000 rows that match.
>>
>> After creating a partial index using that userId, things go way faster.
>> This is more-or-less what I assumed I'd get by having that multi-column
>> index of (userId, id), but alas:
>>
>> remoteSyncerLogistics=> CREATE INDEX sillyIndex ON syncerevent (ID) where
>> userId = '57dc984f1c87461c0967e228';
>>
>> CREATE INDEX
>>
>> remoteSyncerLogistics=> EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM
>> SyncerEvent WHERE userId = '57dc984f1c87461c0967e228' ORDER BY ID LIMIT
>> 4000;
>>
>>                                                               QUERY PLAN
>> ----------------------------------------------------------------------------------------------------------------------------
>>  Limit  (cost=0.43..443.21 rows=4000 width=615) (actual time=0.074..13.349 rows=4000 loops=1)
>>    Buffers: shared hit=842 read=13
>>    ->  Index Scan using sillyindex on syncerevent  (cost=0.43..141748.41 rows=1280506 width=615) (actual time=0.071..5.372 rows=4000 loops=1)
>>          Buffers: shared hit=842 read=13
>>  Planning time: 0.245 ms
>>  Execution time: 25.404 ms
>> (6 rows)
>>
>>
>> remoteSyncerLogistics=> EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM
>> SyncerEvent WHERE userId = '57dc984f1c87461c0967e228' AND ID > 12468 ORDER
>> BY ID LIMIT 4000;
>>
>>                                                               QUERY PLAN
>> ----------------------------------------------------------------------------------------------------------------------------
>>  Limit  (cost=0.43..453.34 rows=4000 width=615) (actual time=0.023..13.244 rows=4000 loops=1)
>>    Buffers: shared hit=855
>>    ->  Index Scan using sillyindex on syncerevent  (cost=0.43..144420.43 rows=1275492 width=615) (actual time=0.020..5.392 rows=4000 loops=1)
>>          Index Cond: (id > 12468)
>>          Buffers: shared hit=855
>>  Planning time: 0.253 ms
>>  Execution time: 29.371 ms
>> (7 rows)
>>
>>
>> Any thoughts?
>>
>> -Jake
>>
>
> Hmmm, here's another unexpected piece of information:
>
>
> EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM SyncerEvent WHERE userid =
> '57dc984f1c87461c0967e228';
>
>                                                            QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on syncerevent  (cost=0.00..311251.51 rows=1248302 width=618) (actual time=0.008..4970.507 rows=1259137 loops=1)
>    Filter: (userid = '57dc984f1c87461c0967e228'::text)
>    Rows Removed by Filter: 2032685
>    Buffers: shared hit=108601 read=161523
>  Planning time: 0.092 ms
>  Execution time: 7662.845 ms
> (6 rows)
>
>
> Looks like even just using userid='blah' doesn't actually result in the
> index being used, despite the fact that there are indexes on the userId
> column:
>
>
> "syncerevent_pkey" PRIMARY KEY, btree (id)
>
> "syncereventidindex" UNIQUE, btree (eventid)
>
> "anothersyncereventidindex" btree (userid)
>
> "anothersyncereventidindexwithascending" btree (userid, id)
>
> "asdfasdgasdf" btree (userid, id DESC)
>
> "syncereventuseridhashindex" hash (userid)
>
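> One guess on my end: maybe the planner estimates that this userid matches
> such a large fraction of the table that a sequential scan looks cheaper
> than any of the indexes. A quick way to sanity-check that estimate (just a
> sketch of the stats lookup):
>
> SELECT attname, n_distinct, most_common_vals, most_common_freqs
> FROM pg_stats
> WHERE tablename = 'syncerevent' AND attname = 'userid';
>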
>
> -Jake
>

So... it seems that setting the userId to one that has fewer rows in the
table results in the index actually being used...

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM SyncerEvent WHERE userId =
'57d35db7353b0d627c0e592f' AND ID > 12468 ORDER BY ID LIMIT 4000;

                                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..8574.30 rows=4000 width=618) (actual time=0.031..13.190 rows=4000 loops=1)
   Buffers: shared hit=867
   ->  Index Scan using anothersyncereventidindexwithascending on syncerevent  (cost=0.56..216680.62 rows=101090 width=618) (actual time=0.027..5.313 rows=4000 loops=1)
         Index Cond: ((userid = '57d35db7353b0d627c0e592f'::text) AND (id > 12468))
         Buffers: shared hit=867
 Planning time: 0.168 ms
 Execution time: 29.331 ms
(7 rows)

Is there some way to force the use of one of the indexes on the userId
column?
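
(For testing, I suppose I could lean on the planner settings, e.g.

SET enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM SyncerEvent
  WHERE userid = '57dc984f1c87461c0967e228';
RESET enable_seqscan;

but that reads more like a diagnostic than a real fix.)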
