Re: MongoDB 3.2 beating Postgres 9.5.1?

From: Sameer Kumar <sameer(dot)kumar(at)ashnik(dot)com>
To: Paul Jones <pbj(at)cmicdo(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: MongoDB 3.2 beating Postgres 9.5.1?
Date: 2016-07-19 02:23:03
Message-ID: CADp-Sm7ws0GTUQPUZtubWEBgw_VrZZVph26sGZckr_jOSR+EOA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, 11 Mar 2016, 9:39 p.m. Paul Jones, <pbj(at)cmicdo(dot)com> wrote:

> I have been running the EDB benchmark that compares Postgres and MongoDB.
> I believe EDB ran it against PG 9.4 and Mongo 2.6. I am running it
> against PG 9.5.1 and Mongo 3.2 with WiredTiger storage using 10,000,000
> JSON records generated by the benchmark. It looks like Mongo is winning,
> and apparently because of its cache management.
>
> The first queries on both run in ~30 min. And, once PG fills its cache,
> it whips Mongo on repeats of the *same* query (vmstat shows no disk
> reads for PG).
>
> However, when different query on the same table is issued to both,
> vmstat shows that PG has to read the *entire* table again, and it takes
> ~30 min. Mongo does a lot of reads initially but after about 5 minutes,
> it stops reading and completes the query, most likely because it is
> using its cache very effectively.
>
> Host: Virtual Machine
> 4 CPUs
> 16 Gb RAM
> 200 Gb Disk
> RHEL 6.6
>
> PG: 9.5.1 compiled from source
> shared_buffers = 7GB
> effectve_cache_size = 12GB
>
> Mongo: 3.2 installed with RPM from Mongo
>
> In PG, I created the table by:
>
> CREATE TABLE json_tables
> (
> data JSONB
> );
>
> After loading, it creates the index:
>
> CREATE INDEX json_tables_idx ON json_tables USING GIN (data
> jsonb_path_ops);
>

This would create one GIN index which is going to be a bit larger than
usual btree /n-tree index on a specific JSON field. And would be slower
too. I suggest that you create an index on the specific expression using
JSON operators. In my opinion that index would be much more nearer to
mongoDB indexes.

> After a lot of experimentation, I discovered that the benchmark was not
> using PG's index, so I modified the four queries to be:
>
> SELECT data FROM json_tables WHERE data @> '{"brand": "ACME"}';
> SELECT data FROM json_tables WHERE data @> '{"name": "Phone Service Basic
> Plan"}';
> SELECT data FROM json_tables WHERE data @> '{"name": "AC3 Case Red"}';
> SELECT data FROM json_tables WHERE data @> '{"type": "service"}';
>
> Here are two consecutive explain analyze for PG, for the same query.
> No functional difference in the plans that I can tell, but the effect
> of PG's cache on the second is dramatic.
>
> If anyone has ideas on how I can get PG to more effectively use the cache
> for subsequent queries, I would love to hear them.
>
> -------
>
> benchmark=# explain analyze select data from json_tables where data @>
> '{"name": "AC3 Case Red"}';
>
> QUERY PLAN
>
>
>
> -----------------------------------------------------------------------------------------------------------------------------------------
>
> Bitmap Heap Scan on json_tables (cost=113.50..37914.64 rows=10000
> width=1261)
> (actual time=2157.118..1259550.327 rows=909091 loops=1)
> Recheck Cond: (data @> '{"name": "AC3 Case Red"}'::jsonb)
> Rows Removed by Index Recheck: 4360296
> Heap Blocks: exact=37031 lossy=872059
> -> Bitmap Index Scan on json_tables_idx (cost=0.00..111.00 rows=10000
> width =0) (actual time=2141.250..2141.250 rows=909091 loops=1)
> Index Cond: (data @> '{"name": "AC3 Case Red"}'::jsonb)
> Planning time: 291.932 ms
> Execution time: 1259886.920 ms
> (8 rows)
>
> Time: 1261191.844 ms
>
> benchmark=# explain analyze select data from json_tables where data @>
> '{"name": "AC3 Case Red"}';
> QUERY PLAN
>
>
> ---------------------------------------------------------------------------------------------------------------------------------------
>
> Bitmap Heap Scan on json_tables (cost=113.50..37914.64 rows=10000
> width=1261) (actual time=779.261..29815.262 rows=909091 loops=1)
> Recheck Cond: (data @> '{"name": "AC3 Case Red"}'::jsonb)
> Rows Removed by Index Recheck: 4360296
> Heap Blocks: exact=37031 lossy=872059
> -> Bitmap Index Scan on json_tables_idx (cost=0.00..111.00 rows=10000
> width =0) (actual time=769.081..769.081 rows=909091 loops=1)
> Index Cond: (data @> '{"name": "AC3 Case Red"}'::jsonb)
> Planning time: 33.967 ms
> Execution time: 29869.381 ms
>
> (8 rows)
>
> Time: 29987.122 ms
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
--
--
Best Regards
Sameer Kumar | DB Solution Architect
*ASHNIK PTE. LTD.*

101 Cecil Street, #11-11 Tong Eng Building, Singapore 069 533

T: +65 6438 3504 | M: +65 8110 0350 | www.ashnik.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kisung Kim 2016-07-19 03:19:49 Re: MongoDB 3.2 beating Postgres 9.5.1?
Previous Message Kisung Kim 2016-07-19 02:12:19 Re: MongoDB 3.2 beating Postgres 9.5.1?