Re: BRIN indexes

From: Felipe Santos <felipepts(at)gmail(dot)com>
To: Igor Neyman <ineyman(at)perceptron(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Melvin Davidson <melvin6925(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, Thomas Kellerer <spam_eater(at)gmx(dot)net>
Subject: Re: BRIN indexes
Date: 2016-01-29 10:40:10
Message-ID: CAPYcRiXJTf8Zk+vXdk5fNFjCNm_=9A6uHvGmGU6=F0QG5MTq0A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

2016-01-28 16:33 GMT-02:00 Igor Neyman <ineyman(at)perceptron(dot)com>:

>
>
> *From:* pgsql-general-owner(at)postgresql(dot)org [mailto:
> pgsql-general-owner(at)postgresql(dot)org] *On Behalf Of *Felipe Santos
> *Sent:* Thursday, January 28, 2016 1:17 PM
> *To:* Joshua D. Drake <jd(at)commandprompt(dot)com>
> *Cc:* Melvin Davidson <melvin6925(at)gmail(dot)com>; David Rowley <
> david(dot)rowley(at)2ndquadrant(dot)com>; pgsql-general(at)postgresql(dot)org; Thomas
> Kellerer <spam_eater(at)gmx(dot)net>
> *Subject:* Re: [GENERAL] BRIN indexes
>
>
>
> "Further to the point, it is self defeating to have more than one BRIN
> index on the table if the columns involved would have mutually
> non-adjacent pages."
>
>
>
> Not really, if both columns are ordered, BRIN will work
>
>
>
> "Therefore, it actually would be good to state that in the documentation,
> even it were just a comment."
>
>
>
> It is = "BRIN is designed for handling very large tables in which
> certain columns have some natural correlation with their physical location
> within the table"
>
> Link: http://www.postgresql.org/docs/devel/static/brin-intro.html
>
>
>
>
>
> Also, I did some tests and here are the results I got:
>
>
>
> Query with no index = completion time 43s
>
> Same Query with BRIN = completion time 14s / index size 0,5 MB
>
> Same Query without BRIN and with BTREE = completion time 10s / index size
> 5.000,00 MB
>
>
>
> As you can see, BRIN can save 99% of disk space for just a slightly worse
> performance.
>
>
>
> It seems like a huge improvement, given that your data fits BRIN's use
> case.
>
>
>
> Felipe,
>
>
>
> What kind of queries you used in your test?
>
> Where they based on clustering columns?
>
>
>
> Regards
>
> Igor Neyman
>

Hello Igor,

I took the sample BRIN test from the new release's wiki and added the BTREE
test:
https://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.5#BRIN_Indexes

The results today may vary from the reported above but are still in the
same levels of performance gain:

brin_db=# CREATE TABLE orders (
brin_db(# id int,
brin_db(# order_date timestamptz,
brin_db(# item text);
CREATE TABLE

brin_db=# INSERT INTO orders (order_date, item)
brin_db-# SELECT x, 'dfiojdso'
brin_db-# FROM generate_series('2000-01-01 00:00:00'::timestamptz,
'2015-03-01 00:00:00'::timestamptz,'2 seconds'::interval) a(x);
INSERT 0 239243401

brin_db=# \dt+ orders
List of relations
Schema | Name | Type | Owner | Size | Description
--------+--------+-------+----------+-------+-------------
public | orders | table | postgres | 12 GB |
(1 row)

brin_db=# EXPLAIN ANALYSE SELECT count(*) FROM orders WHERE order_date
BETWEEN '2012-01-04 09:00:00' and '2014-01-04 14:30:00';

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------
-------
Aggregate (cost=4108912.01..4108912.02 rows=1 width=0) (actual
time=81116.722..81116.722 rows=1 loops=1)
-> Seq Scan on orders (cost=0.00..4106759.58 rows=860972 width=0)
(actual time=60173.531..78566.113 rows=31589101 loops=1)
Filter: ((order_date >= '2012-01-04 09:00:00-02'::timestamp with
time zone) AND (order_date <= '2014-01-04 14:30:00-02'::timestamp with time
zone))
Rows Removed by Filter: 207654300
Planning time: 0.443 ms
Execution time: 81118.168 ms
(6 rows)

brin_db=# CREATE INDEX idx_order_date_brin
brin_db-# ON orders
brin_db-# USING BRIN (order_date);
CREATE INDEX

brin_db=# \di+ idx_order_date_brin
List of relations
Schema | Name | Type | Owner | Table | Size |
Description
--------+---------------------+-------+----------+--------+--------+-------------
public | idx_order_date_brin | index | postgres | orders | 432 kB |
(1 row)

brin_db=# EXPLAIN ANALYSE SELECT count(*) FROM orders WHERE order_date
BETWEEN '2012-01-04 09:00:00' and '2014-01-04 14:30:00';

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------
Aggregate (cost=2408269.34..2408269.35 rows=1 width=0) (actual
time=14164.923..14164.923 rows=1 loops=1)
-> Bitmap Heap Scan on orders (cost=326808.28..2328609.76
rows=31863832 width=0) (actual time=42.742..11430.562 rows=31589101 loops=1)
Recheck Cond: ((order_date >= '2012-01-04 09:00:00-02'::timestamp
with time zone) AND (order_date <= '2014-01-04 14:30:00-02'::timestamp with
time zone))
Rows Removed by Index Recheck: 21907
Heap Blocks: lossy=201344
-> Bitmap Index Scan on idx_order_date_brin
(cost=0.00..318842.32 rows=31863832 width=0) (actual time=36.151..36.151
rows=2013440 loops=1)
Index Cond: ((order_date >= '2012-01-04
09:00:00-02'::timestamp with time zone) AND (order_date <= '2014-01-04
14:30:00-02'::timestamp
with time zone))
Planning time: 0.297 ms
Execution time: 14164.985 ms
(9 rows)

brin_db=# drop index idx_order_date_brin ;
DROP INDEX

brin_db=# create index idx_order_date_btree on orders(order_date);
CREATE INDEX

brin_db=# \di+ idx_order_date_btree
List of relations
Schema | Name | Type | Owner | Table | Size |
Description
--------+----------------------+-------+----------+--------+---------+-------------
public | idx_order_date_btree | index | postgres | orders | 5125 MB |
(1 row)

brin_db=# EXPLAIN ANALYSE SELECT count(*) FROM orders WHERE order_date
BETWEEN '2012-01-04 09:00:00' and '2014-01-04 14:30:00';

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------
-----------
Aggregate (cost=1269366.79..1269366.80 rows=1 width=0) (actual
time=10435.148..10435.148 rows=1 loops=1)
-> Index Only Scan using idx_order_date_btree on orders
(cost=0.57..1189707.21 rows=31863832 width=0) (actual time=0.656..7919.754
rows=31589101
loops=1)
Index Cond: ((order_date >= '2012-01-04 09:00:00-02'::timestamp
with time zone) AND (order_date <= '2014-01-04 14:30:00-02'::timestamp with
t
ime zone))
Heap Fetches: 31589101
Planning time: 6.285 ms
Execution time: 10435.197 ms
(6 rows)

Att.,

Felipe

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message John R Pierce 2016-01-29 10:47:39 Re: BDR replication
Previous Message Nikhil 2016-01-29 10:27:10 BDR replication