Quick Links

Re: Gist indexing performance with cidr types

From:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To:	Henrik Thostrup Jensen <htj(at)nordu(dot)net>
Cc:	Emre Hasegeli <emre(at)hasegeli(dot)com>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: Gist indexing performance with cidr types
Date:	2015-08-26 16:08:01
Message-ID:	CAMkU=1x=puaY9M-xk6=9TZQbWe6EBSMDQFVGNPHGRrM9f_Gc7A@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On Wed, Aug 26, 2015 at 4:29 AM, Henrik Thostrup Jensen <htj(at)nordu(dot)net>
wrote:

> On Wed, 26 Aug 2015, Emre Hasegeli wrote:
>
> Are the coverage operatons just that expensive?
>>>
>>
>> They shouldn't be. A similar query like yours works in 0.5 second on my
>> laptop:
>>
> [snip]
>
> I get the same from your testcase.
>
> Maybe, something we haven't expected about your dataset causes a
>> performance regression on the index. Did you see anything relevant on
>> the server logs on index creation time?
>>
>
> I tried dropping and re-creating the index. The only log entry was for the
> drop statement.
>
> The distribution of the data is not uniform like the data set you produce.
> Though I find it hard to believe that it would affect this as much.
>
> select masklen(route), count(*) from routes group by masklen(route);
>

Any chance you can share the actual underlying data? I noticed it wasn't
on github, but is that because it is proprietary, or just because you don't
think it is interesting?

> irrexplorer=> explain analyze select routes.route from routes join hmm on
> routes.route && hmm.route;
> QUERY PLAN
>
> ----------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop (cost=0.41..511914.27 rows=2558 width=7) (actual
> time=8.096..17209.778 rows=8127 loops=1)
> -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual
> time=0.010..0.609 rows=732 loops=1)
> -> Index Only Scan using route_gist on routes (cost=0.41..470.32
> rows=22900 width=7) (actual time=4.823..23.502 rows=11 loops=732)
> Index Cond: (route && (hmm.route)::inet)
> Heap Fetches: 0
> Planning time: 0.971 ms
> Execution time: 17210.627 ms
> (7 rows)
>

If you loop over the 732 rows yourself, issuing the simple query against
each retrieved constant value:

explain (analyze,buffers) select routes.route from routes where route && $1

Does each one take about the same amount of time, or are there some outlier
values which take much more time than the others?

Cheers,

Jeff

In response to

Re: Gist indexing performance with cidr types at 2015-08-26 11:29:01 from Henrik Thostrup Jensen

Responses

Re: Gist indexing performance with cidr types at 2015-08-27 09:35:31 from Henrik Thostrup Jensen

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Tory M Blue	2015-08-26 19:14:04	Index creation running now for 14 hours
Previous Message	Emre Hasegeli	2015-08-26 13:46:07	Re: Gist indexing performance with cidr types