Re: Is it possible to have a "fast-write" Index?

From: deavid <deavidsedice(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Is it possible to have a "fast-write" Index?
Date: 2015-06-18 19:36:58
Message-ID: CAFR-75teFn5Uneq=gTaDrmYhEyUqSVGRosMWfsf7iw1AAJoNEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I did another try on BRIN and GIN indexes, and I compared to regular btree
indexes. Now i have 16M rows to do the test.

The numbers seem to be good. Both GIN and BRIN seem good options for
certain tables with more writes than reads (Specially BRIN is very good)

I want to share with you my test; I used real-world data, but i didn't had
time to do something accurate or real-word uses. I know the methodology is
not enough, and maybe some calculations on the spreadsheet are wrong. I
tried to do my best.

I'm using an SSD and I'm trying to compare CPU cost, not I/O.

In short, the results were: (compared to btree)
- INSERT: GIN is 50% faster; BRIN is 6x faster. This is the best scenario.
- UPDATE: each case has a winner; for big updates BRIN is 10x faster and
GIN is 25x faster. For small updates (most real world cases) BTREE is
always the winner; but BRIN gives some good results too.
- DELETE: Almost no difference between the three.
- SELECT: BTREE here is the winner. BRIN is 10% slower, and GIN performance
seems a bit random.

VACUUM, ANALYZE and other tasks are 6x faster with BRIN, 50% faster with
GIN.
Index sizes are 50% smaller with GIN, but with BRIN they are very very
small

Hope you find useful these numbers.

El sáb., 13 jun. 2015 a las 11:41, deavid (<deavidsedice(at)gmail(dot)com>)
escribió:

> Sorry; Because some misconfiugration vacuum and analyze were'nt working.
> Now I'm getting better numbers for BRIN indexes where there are zero rows
> to match.
>
> El sáb., 13 jun. 2015 a las 3:17, deavid (<deavidsedice(at)gmail(dot)com>)
> escribió:
>
>> So I just ran a test case for hash, btree, gin_btree and brin indexes.
>> Also without indexes, and without primary keys.
>> * Testing "deliverynotes" table.
>> - Definition and use case:
>> It is a table contaning real delivery note headers of several years
>> It consists of 300k rows, 128 columns, 63 indexes, 243Mb of data
>> excluding indexes. Since is a table visible for users, almost every
>> column can be searched so we need lots of indexes. We do not need
>> searches to be the fastest possible, we only need to accelerate a
>> bit our user searches; without harming too much writes.
>> - Things to test:
>> - measure index creation times.
>> - measure index space.
>> - with indexes but without primary key
>> - with everything
>> - Create fully, delete everything and Insert again data in blocks
>> - Test updates for recent data
>>
>> I attached the logs for every test, if anyone wants to see what i'm
>> exactly testing.
>> This was tested on my i5 laptop with 4Gb of RAM and a 120Gb SSD (OCZ
>> Agility 3). I'm trying to measure CPU time, not I/O time, so some
>> configurations and tests are specific to avoid as much as IO as I can.
>> I'm using a dev build for Postgresql 9.5 downloaded from git sources.
>>
>> Conclusions:
>> - Gin_btree seems slower in almost every case. It's writes are marginally
>> better than regular btrees even when using work_mem=160MB. (May be 20%
>> faster than btree). They are smaller than I thought.
>> - BRIN indexes seem very fast for writes. For selects maybe is a blend
>> between having indexes and don't having them. They don't recognize that
>> some values are simply out of range of indexed values, and that's a pity.
>> If the values we want are packed together I guess I would get even better
>> results.
>> - Primary keys and uniqueness checks doesn't seem to make any difference
>> here.
>> - Having no indexes at all is faster than I imagined. (Sometimes it beats
>> BRIN or Btree) Maybe because the IO here is faster than usual.
>> - Hash indexes: i tried to do something, but they take too much time to
>> build and i don't know why. If creates are slow, updates should be slow
>> too. I'm not going to test them again.
>>
>> And finally, don't know why but i couldn't vacuum or analyze tables. It
>> always get stalled without doing anything; so i had to comment every
>> vacuum. Maybe there is a bug in this dev version or i misconfigured
>> something.
>>
>> El vie., 12 jun. 2015 a las 7:27, Simon Riggs (<simon(at)2ndquadrant(dot)com>)
>> escribió:
>>
>>> On 5 June 2015 at 18:07, deavid <deavidsedice(at)gmail(dot)com> wrote:
>>>
>>>> There are several use cases where I see useful an index, but adding it
>>>> will slow too much inserts and updates.
>>>> For example, when we have 10 million rows on a table, and it's a table
>>>> which has frequent updates, we need several index to speed up selects, but
>>>> then we'll slow down updates a lot, specially when we have 10 or more
>>>> indexes.
>>>> Other cases involve indexes for text search, which are used only for
>>>> user search and aren't that important, so we want to have them, but we
>>>> don't want the overload they put whenever we write on the table.
>>>> I know different approaches that already solve some of those problems
>>>> in some ways (table partitioning, partial indexes, etc), but i don't feel
>>>> they are the solution to every problem of this kind.
>>>>
>>>> Some people already asked for "delayed write" indexes, but the idea
>>>> gets discarded because the index could get out of sync, so it can omit
>>>> results and this is unacceptable. But i think maybe that could be fixed in
>>>> several ways and we can have a fast and reliable index (but maybe not so
>>>> fast on selects).
>>>>
>>>
>>> This is exactly the use case and mechanism for BRIN indexes.
>>>
>>> --
>>> Simon Riggs http://www.2ndQuadrant.com/
>>> <http://www.2ndquadrant.com/>
>>>
>>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>>
>>

Attachment Content-Type Size
compare_indexes.ods application/vnd.oasis.opendocument.spreadsheet 49.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Nasby 2015-06-18 19:49:57 Re: Is it possible to have a "fast-write" Index?
Previous Message Tom Lane 2015-06-18 19:35:43 Re: Inheritance planner CPU and memory usage change since 9.3.2