Re: Date vs Timestamp without timezone Partition Key

From: Cedric Leong <cedricleong(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Date vs Timestamp without timezone Partition Key
Date: 2020-06-06 02:49:46
Message-ID: CAD6i=X3f96WXKYwALc6CBHsnDDi5z8nE76h0y37uf+uBo1Rc4w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

It's less of a complaint rather than just a warning not to do what I did.

Version:
PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.9.3, 64-bit

The actual command list would probably be impractical to put in here just
because the majority of it would just be creating a large amount of
partition tables. But in summary what i've done is basically this:
Existing database has a partitioned fact table
1. Create an exact clone of that partitioned fact table which includes all
the same indexes, columns, and partitioned tables
2. Change the partitioned table's partition key from an indexed date column
to an indexed timestamp without timezone column
3. Do an insert into from the old partitioned fact table to the new
partitioned fact table which includes all the same rows (insert into since
i wanted the timestamp without timezone column to be in a new timezone)
4. Switch the names of the tables so the new one will be the one that's used
5. VACUUM FULL; ANALYZE;

For my use case which is a data warehouse star schema, this fact table is
basically the base table of every report. To be more specific, the reports
I've tested on varied from 2x slower to 4x slower. From what I see so far
that's because the query plan is drastically different for both. An example
of this test would look like this: https://explain.depesz.com/s/6rP8 and
https://explain.depesz.com/s/cLUY
These tests are running the exact same query on two different tables with
the exception that they use their respective partition keys.

On Fri, Jun 5, 2020 at 10:17 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Sat, 6 Jun 2020 at 14:12, Cedric Leong <cedricleong(at)gmail(dot)com> wrote:
> > Somewhat unrelated but note to anyone who wants to swap out partition
> keys. Don't create a clone of the table with the new partition key and
> insert data. It messes up the query planner massively and makes everything
> much slower.
>
> That complaint would have more meaning if you'd mentioned which
> version of PostgreSQL you're using. The performance of partitioning in
> PostgreSQL has changed significantly over the past 3 releases. Also
> would be useful to know what you've actually done (actual commands).
> I can't imagine it makes *everything* slower, so it might be good to
> mention what is actually slower.
>
> David
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message David Rowley 2020-06-06 03:13:40 Re: Date vs Timestamp without timezone Partition Key
Previous Message David Rowley 2020-06-06 02:16:50 Re: Date vs Timestamp without timezone Partition Key