Re: When to store data that could be derived

From: Chris Travers <chris(dot)travers(at)gmail(dot)com>
To: Ron <ronljohnsonjr(at)gmail(dot)com>
Cc: "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: When to store data that could be derived
Date: 2019-03-24 14:53:38
Message-ID: CAKt_ZfvGpzRE4pZp2JJ2iwO9S3POLmO6CSCwRxrybd-NdOtPyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Meant to send this to the list but hit the wrong button.

On Sun, Mar 24, 2019 at 9:45 AM Ron <ronljohnsonjr(at)gmail(dot)com> wrote:

> On 3/24/19 3:05 AM, Frank wrote:
> >
> >
> > On 2019-03-24 9:25 AM, Ron wrote:
> >> On 3/24/19 1:42 AM, Frank wrote:
> >>> Hi all
> >>>
> >>> As I understand it, a general rule of thumb is that you should never
> >>> create a physical column if the data could be derived from existing
> >>> columns. A possible reason for breaking this rule is for performance
> >>> reasons.
> >>>
> >>> I have a situation where I am considering breaking the rule, but I am
> >>> not experienced enough in SQL to know if my reason is valid. I would
> >>> appreciate it if someone could glance at my 'before' and 'after'
> >>> scenarios and see if, from a 'gut-feel' point of view, I should
> proceed.
> >>>
> >
> > [snip]
> >
> >>
> >> Sure the second query joins a lot of tables, but is pretty
> straightforward.
> >>
> >> What REALLY worries me is whether or not the query optimiser would look
> >> at the WHERE CASE, run away screaming and then make it use sequential
> >> scans. Thus, even query #1 would be slow.
> >>
> >
> > I had not realised that. I hope someone else chimes in on this.
>
> In every DBMS that I've used, the lside (left side) needs to be static
> (not
> "a" static) instead of variable (like a function).
>
> For example, this always leads to a sequential scan:
> WHERE EXTRACT(DAY FROM DATE_FIELD) = 5
>

PostgreSQL allows expression indexes

So you can:

create index foo on bar ((id % 1000));

And then use the index on:

select * from bar where id % 1000 = 45;

You could similarly

create index foo on bar (extract(day from date_field));

The left side needs to be indexed (and an immutable expression) but beyond
that.....

>
> >
> >>
> >> Is this a historical data set that's never updated, or current data
> >> that's constantly added to?
> >>
> >
> > It is the latter - current data constantly added to.
> >
> > Frank
> >
>
> --
> Angular momentum makes the world go 'round.
>
>

--
Best Wishes,
Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor
lock-in.
http://www.efficito.com/learn_more

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Steve Baldwin 2019-03-24 19:37:58 Re: regr_slope returning NULL
Previous Message Peter J. Holzer 2019-03-24 12:41:09 Re: When to store data that could be derived