PostgreSQL and VIEWS

From: Misa Simic <misa(dot)simic(at)gmail(dot)com>
To: pgsql <pgsql-general(at)postgresql(dot)org>
Subject: PostgreSQL and VIEWS
Date: 2013-03-24 02:25:32
Message-ID: CAH3i69=L2=LXE4YTxwJg3brC_Pv703+Mz2vH504qxcHWF3t5aw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

HI,

When I have met PostgreSQL for a first time - I have been really amazed -
with many things...

But how we started to use it - and data jumps in - we meet performance
problems...

Now, it is a bit tricky... any concrete performance problem - can be solved
on some way...

However, I am more concerned, that "our way" how we do the things - is not
best way for Postgres...

The thing is - our tables are highly normalised - so lot of joins... (NULLS
do not exist in the tables)

to make things simplier we use VIEWS to "denormalise" data - though it is
not that much we care about denormalisation - it is more to make things
simplier and less error prone...

So every "thing" (Customer, product, transaction, employee, whatever) is
built up from more tables... I am not sure we have even one "thing" built
up just from 1 table...

Lot of "thing properties" are actually calculations: i.e. Invoice Amount,
or for Employee:
We have in the one table: first_name and last_name fields, but Full name is
concatented as:

last_name || ',' || first_name

So, whenever we need employee full name - is it on Employee Info window -
or in an Invoice as Salesperson, or in Payslip report...instead of to
everywhere have above formula - we have function...
but again, instead of to developer think about each possible function for
concrete thing - we have made for each entity - the view... what have all
relevant joins - and all relevant calculated properties... about the
thing...

I have had a thought - somewhere read (but now I am more convenient I have
read it wrong) that planner is that smart that if we we have:

CREATE VIEW person_view AS
SELECT person_id, first_name, last_name, fn_full_name(id) as full_name,
date_of_birth, age(date_of_birth) from person LEFT JOIN
person_date_of_birth USING (person_id)

SELECT first_name FROM person_view WHERE id = 1

Planner is smart and:
-will not care about joins - you ask for the field(s) - what do not belong
to other tables - both belong to 1 - and it is pk!
-will not care about functions - you havent asked for any field what is
function in your query...

However - how we met more and more problems with our approach... and
spending more time on
EXPLAIN ANALYZE - instead of on business problems... It seems things are
not that good...

for simple questions as above - results are acceptable - even looking into
EXPLAIN ANALYZE i would not say it is the best possible plan... (i.e.
planner spending time on Seq Scan on person_date_of_birth_table - and
filter it - even no need to think about that table at all - LEFT JOIN (show
me columns - if there is a matching row for pk column) - so could be check
via index -however - there is no any column from that table in the query -
I would simple discard that table from plan....

So query

SELECT id FROM view WHERE id = 5 (view is SELECT * FROM table1 LEFT JOIN
table2)

I would treat the same as:
SELECT id FROM table1 = 5

ok in INNER JOIN it requires additional confimration - but even there is FK
to PK join - that confirmation is not needed iether - but in our cases it
is always FK to PK...

However - if we need to involve more "entities"/views - from some unknown
reason to me - postgres always picks bad plan...

i.e. list of employees what work in New York

we have employees_contract table:
contract_id, person_id, department_id,

a lot of others tables, but to make it shorter:

Department_view

Buidlings_view

and now query:
SELECT full_name FROM person_view INNER JOIN emplyee_contract USING
(person_id) INNER JOIN department_view USING (department_id) INNER JOIN
buildings_view USING (building_id) WHERE city_id = 'NY'

from some unknown reason - gives bad plan - then if we "refactor" query and
send different question - we get good result... I am pretty sure planner
should be capable to "rephrase" my question instead of me...

I would like to hear your expirience with VIEWS in postgres... And some
kind of best practice/advice for described situation... So far it looks to
me there is no way - to make things ready for any specific question - every
"request" will need specific SQL syntax to drive planner in acceptable
direction...

Many thanks,

Misa...

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Serge Fonville 2013-03-24 02:37:27 Re: PostgreSQL and VIEWS
Previous Message Chris Angelico 2013-03-24 00:42:28 Re: Rewritten rows on unchanged values