From: "Enrico Thierbach" <eno(at)open-lab(dot)org>
To: "Michael Lewis" <mlewis(at)entrata(dot)com>
Cc: "PostgreSQL General" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: partitions vs indexes
Date: 2019-10-03 10:29:34
Message-ID: 7E29E5D6-56A8-4043-A35A-368430DD9190@open-lab.org
Lists: pgsql-general
On 2 Oct 2019, at 22:16, Michael Lewis wrote:
> "I would like to convert a table with a primary key into a partitioned
> setup by a column which is not part of the primary key"
>
> That isn't possible. The partition key must be contained in the primary
> key. That is, the primary key could be (site_id, id), and you can create
> a hash partition on id or site_id, but not on created_on.
>
> You could drop the primary key and foreign keys and implement them via
> trigger functions as described in this blog series, but it seems
> questionable:
> https://www.depesz.com/2018/10/31/foreign-key-to-partitioned-table-part-2/
>
> I would not assume the restriction will be dropped in future releases. I
> don't know that scanning all the partitions to figure out whether the
> primary key is violated would be advisable. Which is what the trigger
> functions described in the blog post have to do, right?
>
> It might be noteworthy that partitioning with more than 10-100 partitions
> is MUCH faster in PG12 than in PG11 (up to 4-8 thousand partitions),
> according to testing shared by those working on that code.
>
Michael, thank you for your response. I think I now grasp the idea: if
there is a uniqueness constraint, the database would rather not visit
all partitions to check for a constraint violation, but wants to
identify the single partition that could raise it; hence the partition
key columns must be a subset of (or identical to) the columns of the
uniqueness constraint.
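For my own understanding, a minimal sketch of that restriction (table and
column names are made up):

    -- rejected: the primary key does not include the partition key column
    CREATE TABLE events (
        id         bigint,
        created_on timestamptz NOT NULL,
        PRIMARY KEY (id)
    ) PARTITION BY RANGE (created_on);

    -- accepted: created_on is part of the primary key
    CREATE TABLE events (
        id         bigint,
        created_on timestamptz NOT NULL,
        PRIMARY KEY (id, created_on)
    ) PARTITION BY RANGE (created_on);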
I get that this makes sense if you have *many* partitions; however, I
basically want to end up with just two partitions, “hot” and “cold”.
A row’s lifetime starts in the hot partition and, after being
processed, it moves into the cold partition.
Most of the work actually happens in the hot partition, so keeping it
as small as possible is probably helpful. For numbers: the hot
partition would typically contain ~10,000 rows, while the cold
partition will hold tens of millions. At the same time I still want to
be able to look up a row by its id in the root relation, not in the
concrete partitions. Having the database validate a uniqueness
constraint in two tables instead of one would be a worthwhile
sacrifice for me.
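To make that concrete, here is roughly the layout I have in mind, assuming
a hypothetical state column that tracks “hotness” (all names are made up).
Declarative partitioning would force the primary key to include that
column, so id alone would only be unique per partition:

    CREATE TABLE jobs (
        id      bigint NOT NULL,
        state   text   NOT NULL DEFAULT 'hot',
        payload jsonb,
        PRIMARY KEY (id, state)   -- must include the partition key column
    ) PARTITION BY LIST (state);

    CREATE TABLE jobs_hot  PARTITION OF jobs FOR VALUES IN ('hot');
    CREATE TABLE jobs_cold PARTITION OF jobs FOR VALUES IN ('cold');

    -- UPDATE jobs SET state = 'cold' WHERE id = 42;
    -- moves the row into the cold partition (PG 11+), but id alone is only
    -- guaranteed unique per partition, and FKs cannot reference it.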
Having said that, I just realized I could probably reach my goal by
setting up explicit hot and cold tables, moving rows around manually
whenever their “hotness” changes, and setting up a view which combines
both tables into a single relation. I would only have to drop all
explicit FK references from the schema. A downside, certainly, but one
that I could live with.
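Roughly like this (again, a sketch with made-up names):

    CREATE TABLE jobs_hot  (id bigint PRIMARY KEY, payload jsonb);
    CREATE TABLE jobs_cold (id bigint PRIMARY KEY, payload jsonb);

    CREATE VIEW jobs AS
        SELECT * FROM jobs_hot
        UNION ALL
        SELECT * FROM jobs_cold;

    -- moving a row from hot to cold in a single statement:
    WITH moved AS (
        DELETE FROM jobs_hot WHERE id = 42 RETURNING *
    )
    INSERT INTO jobs_cold SELECT * FROM moved;

Uniqueness of id across both tables would then be my application’s
responsibility, of course.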
Best,
/eno
--
me on github: http://github.com/radiospiel