Re: PostgreSQL suitable?

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Vincenzo Romano <vincenzo(dot)romano(at)notorand(dot)it>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andreas Kretschmer <andreas(at)a-kretschmer(dot)de>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: PostgreSQL suitable?
Date: 2017-12-19 17:24:03
Message-ID: 20171219172403.GO4628@tamriel.snowman.net
Lists: pgsql-general

Greetings,

* Vincenzo Romano (vincenzo(dot)romano(at)notorand(dot)it) wrote:
> 2017-12-19 17:32 GMT+01:00 Stephen Frost <sfrost(at)snowman(dot)net>:
> > * Vincenzo Romano (vincenzo(dot)romano(at)notorand(dot)it) wrote:
> >> 2017-12-19 16:48 GMT+01:00 Stephen Frost <sfrost(at)snowman(dot)net>:
> >> > * Vincenzo Romano (vincenzo(dot)romano(at)notorand(dot)it) wrote:
> >> >> Sorry, my bad: I confused V10 with v11.
> >> But according to a discussion with Bruce Momjian, table partitioning in V10 is
> >> little more than syntactic sugar around old-fashioned table partitioning.
> >> >
> >> > Well, it's a bit more than that since there's tuple-routing, but you're
> >> > right that the partition elimination is the same as it was in earlier
> >> > versions and based on constraint exclusion. That said, as noted in the
> >> > email you replied to, reasonable numbers of partitions aren't too bad
> >> > even with the planning cost; it's when you have many thousands of
> >> > partitions that you get into cases where planning time for queries is
> >> > really bad.
> >>
> >> When you have to handle a 100TB table, the number of partitions
> >> shouldn't be 10 or 12
> >> as seen in most examples and tests.
> >
> > Probably not, but it would depend on the data and what you're doing with
> > it. While there are some general rules-of-thumb, there's no one right
> > answer when it comes to the size of individual partitions.
>
> I think partitioning is done in order to make smaller tables or to
> balance table size and table number.
> You currently have to keep in mind a limit to the number of sub-tables.
> This makes partitioning less effective.

Sure, but that doesn't make PG unsuitable for such use-cases, just that
you have to realize these costs and address them.

> >> This is the same type of issues you hit with partial indexes (this is
> >> why I mentioned them earlier).
> >> Sub-table (and partial index) selection algorithm should be
> >> logarithmic or sub-linear.
> >
> > Sure, and work is being done to improve PG in that exact area, but that
> > doesn't mean it can't handle workloads like this today; you just have to
> > be aware of the costs associated with today's partitions.
> >
> >> As long as it's linear, you'll keep hearing about a "reasonable number of partitions".
> >> One thousand partitions for a 100TB table would make "manageable"
> >> 100GB sub-tables.
> >
> > Yup, and 100G tables are certainly large and a bit awkward but they can
> > be managed.
>
> Of course. But I would bet they are still considered as "very large tables".

They're not small. :)

> >> I could easily be wrong, but this is an area where PG needs improvements.
> >
> > Of course, and work is being done to improve it. What's important is
> > knowing that there's a cost to having more partitions when querying
> > through the parent when you get to a point where you have thousands of
> > partitions. That cost may be perfectly fine in some use-cases and in
> > others it might not be, and instead you'd likely have to build logic
> > into the application layer to address it. That's not ideal, which is
> > why there's work being done to improve PG, but it's not necessarily a
> > big deal either. Dynamic SQL is another approach.
> >
> >> One could maybe think about multi-level partitioning, though.
> >
> > Not sure exactly what you're referring to here, but a simple multi-level
> > partitioning setup with PG wouldn't actually change the cost for
> > partition elimination today.
>
> It'd be a tree of tables, allowing for logarithmic selection.
> This is why we love tree data structures.

Sure, but that doesn't matter when it comes to how constraint exclusion
works with PostgreSQL today- all partitions under a given parent are
considered when doing partition elimination.
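
To illustrate (a minimal sketch, all names invented), even a nested
layout still gets walked leaf-by-leaf when constraint exclusion runs:

    -- PG 10 declarative partitioning, nested one level
    CREATE TABLE measurements (
        ts      timestamptz NOT NULL,
        site_id integer     NOT NULL,
        value   numeric
    ) PARTITION BY RANGE (ts);

    -- a yearly partition which is itself partitioned by month
    CREATE TABLE measurements_2017 PARTITION OF measurements
        FOR VALUES FROM ('2017-01-01') TO ('2018-01-01')
        PARTITION BY RANGE (ts);

    CREATE TABLE measurements_2017_12 PARTITION OF measurements_2017
        FOR VALUES FROM ('2017-12-01') TO ('2018-01-01');

    -- Planning this still considers every partition under
    -- "measurements" via constraint exclusion, so the cost grows with
    -- the total number of partitions, not the depth of the tree.
    EXPLAIN SELECT *
      FROM measurements
     WHERE ts >= '2017-12-19' AND ts < '2017-12-20';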

> >> > Also as noted on this thread, PG could handle this data volume, but to
> >> > be efficient there would be work to be done in normalization,
> >> > aggregation, and analyzing the system to ensure you're storing and
> >> > querying on the data efficiently.
> >>
> >> Normalization will grow the number of tables (and later joins) and
> >> you'll very likely end up with at least one table with a
> >> "gazillion" rows.
> >
> > Naturally, but that "gazillion"-row table would be much smaller for
> > having the data normalized- if you don't normalize it then the gazillion
> > row table is a huge amount of duplicated data, making the entire system
> > much larger than necessary.
>
> Partitioning is done for other reasons than de-duplicating data.

Yes, I wasn't suggesting to use partitioning to de-duplicate data, I was
suggesting to use normalization for that.

> The number of rows to be analyzed would still be the same and the
> indexes over those columns would still be rather large.

While true, the indexes would be smaller than they would be without the
data being normalized.
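
Roughly what I have in mind (names entirely made up): move the
repeated, wide attributes out of the big table so that the
gazillion-row table and its indexes only carry narrow keys:

    -- the descriptive attributes are stored once per device...
    CREATE TABLE device (
        device_id integer PRIMARY KEY,
        vendor    text NOT NULL,
        model     text NOT NULL,
        firmware  text NOT NULL
    );

    -- ...while the huge table keeps only a small reference to them
    CREATE TABLE reading (
        device_id integer NOT NULL REFERENCES device,
        ts        timestamptz NOT NULL,
        value     numeric NOT NULL
    );

    -- this index covers a 4-byte key instead of the repeated text
    -- columns, so it ends up considerably smaller
    CREATE INDEX reading_device_ts_idx ON reading (device_id, ts);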

> >> I fear normalization, provided it's really needed, would provide little help.
> >
> > I seriously doubt that's the case. Normalization might reduce that
> > 100TB down to 10's of TB instead, or perhaps even smaller.
>
> A 10x duplication factor seems very bad to me.

I'm not sure why; it's not actually that uncommon.

> If that's the case then normalization would provide some
> improvement while eating resources in the process.

It's unclear what you're referring to here regarding 'eating resources'.
Clearly there's some CPU time that's consumed when doing a join, but
that's generally far cheaper than the time required to pull data off of
disk because it's not in memory.

> Storage nowadays isn't a big issue. Querying is.

Sure, and if all you had to pay was the storage cost then no one would
care- but you also have to pay the memory cost and that's a much bigger
deal. If you have to go out to disk constantly then you've basically
already lost when it comes to query time.

> >> With tables that big I usually do "software partitioning".
> >> I make the software aware of the partition schema so it can direct the
> >> queries to the needed (sub-)tables.
> >
> > Yes, implementing partition elimination at the application level is
> > certainly an approach that can work with PG. Even then, ideally, you'd
> > normalize the data so that the footprint is much smaller, making more
> > efficient use of the storage and memory available.
>
> Normalizing can save resources that you later need to spend joining
> the tables back together.

You were commenting above on how normalization eats resources and now
it's saving them. :) In my experience, the join cost is nothing
compared to the cost to pull data off of disk because you couldn't fit
it all in memory.

> It's a matter of trade-offs and it strictly depends on the queries to
> be run in my opinion.

I agree that it's a matter of trade-offs and certainly depends on the
queries being run, though none of that information is available here.

> Effective table partitioning, instead, seems to me to be more important.

They're both important, imv.

> My bottom line here is: table partitioning is nice, but not yet ready
> for the big data cases where it makes the most sense.

I agree that there's more work to be done there, certainly, but it's
also getting better and there are ways to work around the existing
limitations.
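
For example, the application-level elimination / dynamic SQL approach
mentioned above might look roughly like this (purely a sketch, all
names invented, and it assumes monthly child tables such as
reading_2017_12 already exist):

    -- route the query straight to the relevant child table so the
    -- planner never has to consider the other partitions
    CREATE FUNCTION readings_for_day(d date)
    RETURNS TABLE (device_id integer, ts timestamptz, value numeric)
    LANGUAGE plpgsql AS $$
    BEGIN
        RETURN QUERY EXECUTE format(
            'SELECT device_id, ts, value FROM %I
              WHERE ts >= $1 AND ts < $1 + interval ''1 day''',
            'reading_' || to_char(d, 'YYYY_MM'))
        USING d;
    END;
    $$;

    -- only the December 2017 child table is opened here
    SELECT * FROM readings_for_day('2017-12-19');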

Thanks!

Stephen
