Re: clustered index benchmark comparing Postgresql vs Mariadb

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: 유상지 <y0212(at)naver(dot)com>
Cc: PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: clustered index benchmark comparing Postgresql vs Mariadb
Date: 2017-08-31 15:29:28
Message-ID: CAHyXU0zeziyYOMtgra3VH-P_m8NR+gAfW8LvQiM8cFxGE-BOOQ@mail.gmail.com
Lists: pgsql-general

On Wed, Aug 30, 2017 at 9:03 PM, 유상지 <y0212(at)naver(dot)com> wrote:

> I would like some help with PostgreSQL.
>
> My research suggested that PostgreSQL could be rather fast in an
> environment using a secondary index, but the benchmark came up with
> different results.
>
> The database I compared against was MariaDB, and the benchmark tool was
> sysbench 1.0.8 with the PostgreSQL driver.
>
> Server environment: vmware, Ubuntu 17.04, processor: 4, RAM: 4 GB,
> Harddisk: 40 GB, Mariadb (v10.3), PostgreSQL (v9.6.4)
>
MySQL and some other systems (SQL Server, for example) use a technique where
the table is automatically clustered around an index, generally the primary
key. This technique has some tradeoffs: the main upside is that lookups on
the pkey are somewhat faster, whereas lookups on any other index are
somewhat slower (a secondary index typically has to go back through the
primary key to reach the row) and insertions can be slower in certain cases
(especially if GUIDs are the pkey). I would call this technique 'index
organized table'. The technique exploits the persistent organization so
that the index is implied and does not have to be kept separate from the
heap (the main table data storage).
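
To make the tradeoff concrete, here is a toy sketch in Python. It is not how
any real engine is implemented: the class names, the `email` column, and the
cost tuples are all made up for illustration, a dict stands in for a B-tree,
and the "cost" is just a hand-assigned count of (index descents, heap fetches).

```python
from bisect import bisect_left


class HeapTable:
    """Heap style: rows sit in an unordered heap; every index points at a heap slot."""

    def __init__(self):
        self.heap = []         # rows in arrival order
        self.pkey_index = {}   # id -> heap position (dict stands in for a B-tree)
        self.email_index = {}  # email -> heap position

    def insert(self, row):
        pos = len(self.heap)
        self.heap.append(row)
        self.pkey_index[row["id"]] = pos
        self.email_index[row["email"]] = pos

    def by_id(self, id_):
        # One index descent, then one heap fetch.
        return self.heap[self.pkey_index[id_]], (1, 1)

    def by_email(self, email):
        # Same shape as a pkey lookup: every index is equally "far" from the row.
        return self.heap[self.email_index[email]], (1, 1)


class IndexOrganizedTable:
    """Index organized style: rows are kept in pkey order; the pkey index is the table."""

    def __init__(self):
        self.keys = []         # sorted primary keys
        self.rows = []         # rows, parallel to self.keys
        self.email_index = {}  # email -> primary key (not a physical position)

    def insert(self, row):
        i = bisect_left(self.keys, row["id"])
        self.keys.insert(i, row["id"])  # row order is maintained on every insert
        self.rows.insert(i, row)
        self.email_index[row["email"]] = row["id"]

    def by_id(self, id_):
        # One descent lands directly on the row; there is no separate heap.
        return self.rows[bisect_left(self.keys, id_)], (1, 0)

    def by_email(self, email):
        # Secondary index yields the pkey, which costs a second descent.
        row, (descents, fetches) = self.by_id(self.email_index[email])
        return row, (descents + 1, fetches)


rows = [
    {"id": 3, "email": "c@example.com"},
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
]
heap, iot = HeapTable(), IndexOrganizedTable()
for r in rows:
    heap.insert(r)
    iot.insert(r)

print(heap.by_id(2)[1], iot.by_id(2)[1])  # (1, 1) (1, 0)
print(heap.by_email("c@example.com")[1], iot.by_email("c@example.com")[1])  # (1, 1) (2, 0)
```

The numbers only restate the argument above: the pkey lookup saves the heap
fetch, and the secondary lookup pays for it with an extra descent.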

The postgres CLUSTER command is currently a one-time pass over the table
that organizes it physically in index order; it does not maintain the table
in that order afterwards, nor does it exploit the ordering to eliminate the
primary key index in the manner that other systems do. From a postgres
point of view, the main advantage is that scans (not single record lookups)
over the key will be sequential physical reads and will tend to read fewer
physical pages, since records that are logically adjacent by key will also
be adjacent physically.
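
A rough way to picture the page-count effect, again as a toy Python sketch
rather than anything measured: the 50 rows per page, the key range, and the
rule that new rows are appended at the end of the table are all assumptions
made for the example.

```python
import random

ROWS_PER_PAGE = 50  # assumed page capacity, purely illustrative


def pages_touched(physical_order, lo, hi, rows_per_page=ROWS_PER_PAGE):
    """Count the distinct pages that hold a key in [lo, hi]."""
    return len({
        pos // rows_per_page
        for pos, key in enumerate(physical_order)
        if lo <= key <= hi
    })


random.seed(0)
keys = list(range(0, 20_000, 2))   # 10,000 even keys
unclustered = keys[:]
random.shuffle(unclustered)        # rows in arbitrary physical order
clustered = sorted(unclustered)    # the effect of a one-time clustering pass

# Range scan over 100 adjacent keys (2000..2198).
scattered = pages_touched(unclustered, 2000, 2198)
packed = pages_touched(clustered, 2000, 2198)
print(scattered, packed)           # packed is 2: positions 1000..1099

# The order is not maintained: a later insert of key 2101 is modeled here
# as landing at the end of the table, so the same scan touches one more page.
after_insert = clustered + [2101]
print(pages_touched(after_insert, 2000, 2198))  # 3
```

The clustered scan reads two pages where the shuffled layout reads many
more, and the benefit starts to erode as soon as new rows arrive.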

For my part, I generally prefer the postgres style of organization for most
workloads, particularly for the surrogate key pattern. I would definitely
like to have the option of the index organized style, however. It's
definitely possible to tease out the tradeoffs in synthetic benchmarking,
but in the gross aggregate I suspect (but obviously can't prove) the
technique is a loser, since as database models mature the ways tables are
indexed, looked up, and joined tend to proliferate.

merlin
