Re: PostgreSQL VS MongoDB: a use case comparison

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Fabio Pardi <f(dot)pardi(at)portavita(dot)eu>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: PostgreSQL VS MongoDB: a use case comparison
Date: 2018-11-19 17:26:09
Message-ID: 20181119172609.GR3415@tamriel.snowman.net
Lists: pgsql-performance

Greetings,

* Fabio Pardi (f(dot)pardi(at)portavita(dot)eu) wrote:
> We are open to any kind of feedback and we hope you enjoy the reading.

Looks like a lot of the difference being seen, and the comments made
about one being faster than the other, are because one system is
compressing *everything*, while PG (quite intentionally...) only
compresses the data sometimes: once it hits the TOAST limit. That
likely also contributes to the on-disk size differences you're seeing.
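As a quick check (a minimal sketch; the table name 'patients' is just a
placeholder for whatever you loaded the documents into), you can compare
the heap size against the total on-disk size to get a feel for how much
of the footprint is sitting in TOAST and indexes:

  -- Heap vs. everything else (TOAST + indexes) for a hypothetical table
  SELECT pg_size_pretty(pg_relation_size('patients'))        AS heap_only,
         pg_size_pretty(pg_total_relation_size('patients')
                        - pg_relation_size('patients'))      AS toast_plus_indexes,
         pg_size_pretty(pg_total_relation_size('patients'))  AS total;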

Of course, if you want to see where PG will really shine, you'd stop
thinking of the data as just blobs of JSON and actually define individual
fields in PG instead of just one 'jsonb' column, especially for fields
you know will always exist (which is obviously the case if you're
building an index on one, such as your MarriageDate). Remove those
fields from the jsonb and just reconstruct the JSON when you query;
doing that you'll get the size down dramatically.
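A rough sketch of what that could look like (table and column names here
are made up, not your actual schema): the indexed field becomes a real
column, the index goes on that column, and the original JSON shape is
rebuilt on the way out:

  -- Promote the indexed field to a real, typed column
  CREATE TABLE people (
      id            bigserial PRIMARY KEY,
      marriage_date date,   -- pulled out of the JSON
      data          jsonb   -- the rest of the document, without MarriageDate
  );

  CREATE INDEX people_marriage_date_idx ON people (marriage_date);

  -- Reconstruct the original JSON shape at query time
  SELECT data || jsonb_build_object('MarriageDate', marriage_date)
    FROM people
   WHERE marriage_date BETWEEN '2000-01-01' AND '2000-12-31';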

And that's without even going to the next-level stuff of actual
normalization, where you pull duplicate data out of the JSON and keep
just one instance of it in another, smaller table, using a JOIN to bring
it all back together. Even better, you then only have to update one row
in that other table when something changes in that subset of data,
unlike when you repeatedly store the same data in individual JSON
entries all across the system and such a change requires rewriting every
single JSON object in the entire system...
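Sketching that out on top of the hypothetical 'people' table above (again,
the names and the idea of an "organization" being the repeated data are
just assumptions for illustration):

  -- Data that repeats across documents lives exactly once in its own table
  CREATE TABLE organizations (
      org_id   bigserial PRIMARY KEY,
      org_name text NOT NULL,
      org_city text
  );

  ALTER TABLE people ADD COLUMN org_id bigint REFERENCES organizations (org_id);

  -- Reassemble the full document with a JOIN; when the organization's
  -- details change, only one row in organizations needs updating
  SELECT p.data
         || jsonb_build_object('MarriageDate', p.marriage_date)
         || jsonb_build_object('Organization',
                               jsonb_build_object('name', o.org_name,
                                                  'city', o.org_city))
    FROM people p
    JOIN organizations o USING (org_id);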

Lastly, as with any performance benchmark, please include full details:
all scripts used, all commands run, all data used, so that others can
reproduce your results. I'm sure it'd be fun to take your JSON data and
create actual tables out of it and see what it'd be like then.

Thanks!

Stephen
