Re: Dynamic Partitioning using Segment Visibility Maps

From: Gavin Sherry <swm(at)alcove(dot)com(dot)au>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Chris Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dynamic Partitioning using Segment Visibility Maps
Date: 2008-01-09 22:48:03
Message-ID: 20080109224803.GC999@europa.idg.com.au
Lists: pgsql-hackers

On Wed, Jan 09, 2008 at 08:17:41PM +0000, Simon Riggs wrote:
> On Wed, 2008-01-09 at 20:03 +0100, Gavin Sherry wrote:
>
> > I think Simon's approach is
> > probably more complex from an implementation POV.
>
> Much of the implementation is exactly the same, and I'm sure we agree on
> more than 50% of how this should work already. We just need to close in
> on the remainder.
>
> My current opinion on the SE approach is the opposite of the one above,
> though it gets us nowhere just to state it. I'm trying to avoid opinion
> and look at the details, which is the reason my viewpoint recently
> changed in favour of the dynamic approach as the main thrust for
> implementation.
>
> I've written a detailed email this morning to explain where and how the
> problems lie, which are nowhere near the syntax level. I haven't ruled
> out a declarative approach yet, but I need some detailed technical
> review of the issues. I hope you'll be replying to that?

I'm responding to that email separately. It's just a matter of addressing
all the points in the detail they deserve.

>
> > > 3. If, rather than blindly following, we create something at least
> > > quasi-new, there is the chance of doing fundamentally better.
> > >
> > > This very thing happened when it was discovered that IBM had a
> > > patent on the ARC caching scheme; the "clock" system that emerged
> > > was a lot better than ARC ever was.
> >
> > Well, I don't think I'm proposing we /blindly follow/ others. I propose
> > we choose a grammar which takes the best of what others have tried to
> > do. Oracle's grammar is hideous, IBM's is too restrictive, for example.
>
> I assume the new grammar is good and if we do go that way, it sounds
> like the right starting place.
>
> > > > One major advantage of the dynamic approach is that it can work on
> > > > multiple dimensions simultaneously, which isn't possible with
> > > > declarative partitioning. For example if you have a table of Orders then
> > > > you will be able to benefit from Segment Exclusion on all of these
> > > > columns, rather than just one of them: OrderId, OrderDate,
> > > > RequiredByDate, LastModifiedDate. This will result in some "sloppiness"
> > > > in the partitioning, e.g. if we fill 1 partition a day of Orders, then
> > > > the OrderId and OrderDate columns will start out perfectly arranged. Any
> > > > particular RequiredByDate will probably be spread out over 7 partitions,
> > > > but that's way better than being spread out over 365+ partitions.
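To make the multi-column effect concrete, here is a toy sketch (hypothetical Python, not PostgreSQL internals) of segments that each keep min/max bounds for every column, with one segment filled per day of Orders. OrderDate stays perfectly arranged, while RequiredByDate is "sloppy" across about a week:

```python
# Hypothetical sketch of multi-column Segment Exclusion: each segment keeps
# min/max bounds for every column, and a scan skips any segment whose bounds
# cannot contain the predicate value. All names here are illustrative.
from datetime import date, timedelta

# One segment filled per day: order_id and order_date stay perfectly
# correlated, while required_by_date (order_date + 0..6 days) is "sloppy".
segments = []
for day in range(365):
    d = date(2008, 1, 1) + timedelta(days=day)
    segments.append({
        "order_id": (day * 1000, day * 1000 + 999),
        "order_date": (d, d),
        "required_by_date": (d, d + timedelta(days=6)),
    })

def segments_to_scan(column, value):
    """Return segments whose [min, max] for `column` could contain `value`."""
    return [s for s in segments
            if s[column][0] <= value <= s[column][1]]

target = date(2008, 6, 1)
print(len(segments_to_scan("order_date", target)))        # 1 segment
print(len(segments_to_scan("required_by_date", target)))  # 7 segments
```

Either predicate excludes the vast bulk of the 365 segments, which is the point: the sloppy column still gets most of the benefit.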
> > >
> > > I think it's worth observing both the advantages and demerits of this
> > > together.
> > >
> > > In effect, with the dynamic approach, Segment Exclusion provides its
> > > benefits as an emergent property of the patterns of how INSERTs get
> > > drawn into segments.
> > >
> > > The tendency will correspondingly be that Segment Exclusion will be able
> > > to provide useful constraints for those patterns that can naturally
> > > emerge from the INSERTs.
> >
> > Many people, in my experience, doing the kind of data processing which
> > benefits from partitioning are regularly loading data, rather than
> > collecting it in an OLTP fashion. Let's take the easily understood
> > example of processing web site traffic. If the amount of data is large
> > enough to benefit from partitioning, they probably have multiple web
> > servers and therefore almost certainly multiple log files. If these
> > files are not merged and sorted into a single file, the records will
> > not have a naturally progressing chronology: with every file we go back
> > to the beginning of the period the load covers. If you add parallelism
> > to your load, things look even worse. This means that, under the
> > dynamic model, you could end up with a bunch of partitions which all
> > cover the same time range.
>
> Depends how big we make the partitions and how sloppy this is as to
> whether that is a problem or not. We might still expect a x100 gain from
> using the SE approach depending upon the data volume.

This comes back to my problem with the dynamic approach: the user may or
may not get the benefit, whereas with the declarative approach the result
matches the expectation.
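The overlap problem described above can be simulated in a few lines (a toy sketch with hypothetical names, assuming fixed-size segments and per-segment min/max tracking, not actual PostgreSQL behaviour):

```python
# Three web servers' logs for the same day, each sorted internally but
# loaded one file after another: the timeline restarts at hour 0 with
# every new file, so every segment's hour range spans the whole day.
SEGMENT_ROWS = 1000
HOURS = 24
ROWS_PER_HOUR = SEGMENT_ROWS // HOURS   # ~41 log lines per hour per server

rows = []
for server in range(3):
    rows += [hour for hour in range(HOURS) for _ in range(ROWS_PER_HOUR)]

# Split the loaded rows into fixed-size segments, recording min/max per segment.
segments = [rows[i:i + SEGMENT_ROWS] for i in range(0, len(rows), SEGMENT_ROWS)]
bounds = [(min(s), max(s)) for s in segments]

# A query for hour 12 cannot skip a single segment.
scanned = [b for b in bounds if b[0] <= 12 <= b[1]]
print(bounds)                                       # [(0, 23), (0, 23), (0, 23)]
print(f"{len(scanned)} of {len(segments)} scanned")  # 3 of 3
```

Sorting the input first would restore non-overlapping ranges, but that is exactly the extra step the dynamic approach quietly depends on.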

>
> > Then there's the way that really big databases are used (say, up around
> > Simon's upper bound of 16 TB). It is expensive to keep data online so
> > people aren't. They're loading and unloading data all the time, to
> > perform different forms of analysis.
>
> That isn't my experience. That sounds very time consuming.

I cannot offer any evidence but it's what we see companies doing.

> The storage cost issue was the reason Andrew wanted offline segments,
> and why I have been talking about hierarchical storage.

The thing is, people with lots of data usually have good hierarchical
storage systems. Can we really do better?

>
> > A common scenario in the example
> > above might be to unload all but the current month's data and then load
> > the same month from the previous year. The unload step needs to be
> > cheap (i.e., TRUNCATE). Then, there's no guarantee that what they're
> > interested in is the date range at all. They may want to compare user
> > agent (look at bot activity). In this case, the partitioning is across a
> > small list of strings (well, numbers most likely). Here, the partitions
> > would all have the same range. Partitioning would be useless.
>
> I take it you mean SE-based partitioning would be useless, but
> declarative partitioning would be useful? I would agree, assuming they
> run queries with a few of the small list of strings. Seems like a
> contrived case.

I don't think it's contrived. How about the use case elsewhere in the
thread based on post codes?

>
> > Some of the biggest I've seen are GIS information or network
> > information.
>
> Those are good examples of where a declarative approach would be the
> only way of getting partitioning to work well. So based on that I'll
> agree that some people *need* a declarative approach to get the benefits
> of partitioning, rather than just want.
>
> We do already have constraint exclusion. Can we get by with constraint
> and segment exclusion together?

The reason we (Greenplum) are looking at better partitioning is that the
existing approach is too error prone, too different from other vendors',
too difficult to manage and too hard to reconfigure.

>
> > Also, what about people with their own data types? It may
> > be that the data types do not support ordering so range partitioning
> > does not make sense.
>
> I'd like to hear some examples. Oleg?
>
> > There's also a cost involved with calculating all this in the dynamic
> > approach. Consider tables with a non-trivial number of columns.
>
> True, but that balances the cost of doing that at load time.

I agree, but I'll put more detail into the other email.

Thanks,

gavin
