Re: Declarative partitioning

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Declarative partitioning
Date: 2016-01-22 18:42:01
Message-ID: CADkLM=e-5Dwic4tz9E6Q456iZj8xiPpbB4FUjsuKFha-=YeCAw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
>
> So for now, you create an empty partitioned table specifying all the
> partition keys without being able to define any partitions in the same
> statement. Partitions (and partitions thereof, if any) will be created
> using CREATE PARTITION statements, one for each.
>

...and I would assume that any attempt to insert into a partitioned table
with no partitions (or lacking partitions at a defined level) would be an
error? If so, I'd be ok with that.

> Specifying range partitioning bound as PARTITION FOR RANGE <range-literal>
> sounds like it offers some flexibility, which can be seen as a good thing.
> But it tends to make internal logic slightly complicated.
>
> Whereas, saying PARTITION FOR VALUES LESS THAN (max1, max2, ...) is
> notationally simpler still and easier to work with internally. Also, there
> will be no confusion about exclusivity of the bound if we document it so.

I understand wanting the internal rules to be simple. Oracle clearly went
with VALUES LESS THAN waterfalls for that reason.

What I'm hoping to avoid is:
- having to identify my "year2014" partition by VALUES LESS THAN
'2015-01-01', a bit of cognitive dissonance defining data by what it's not.
- and then hoping that there is a year2013 partition created by someone
with similar sensibilities, the partition definition being incomplete
outside of the context of other partition definitions.
- and then further hoping that nobody drops the year2013 partition, thus
causing new 2013 rows to fall into the year2014 partition, a side effect of
an operation that did not mention the year2014 partition.

Range types do that, and if we're concerned about range type overhead,
we're only dealing with the ranges at DDL time, we can break down the ATR
rules into a more easily digestible form once the partition is modified.

Range continuity can be tested with -|-, but we'd only need to test for
overlaps: gaps in ranges are sometimes a feature, not a bug (ex: I don't
want any rows from future dates and we weren't in business before 1997).

Also, VALUES LESS THAN forces us to use discrete values. There is no way
with to express with VALUES LESS THAN partitions that have float values for
temperature:
ice (,0.0), water [0.0,212.0], steam (212.0,3000.0], plasma (3000.0,).

Yes, I can calculate the day after the last day in a year, I can use
212.0000000001, I can write code to rigorously check that all partitions
are in place. I'd just rather not.

p.s. I'm really excited about what this will bring to Postgres in general
and my organization in particular. This feature alone will help chip away
at our needs for Vertica and Redshift clusters. Let me know if there's
anything I can do to help.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Verite 2016-01-22 18:53:57 Re: [patch] Proposal for \crosstabview in psql
Previous Message Magnus Hagander 2016-01-22 18:26:38 Re: Releasing in September