From: | Misa Simic <misa(dot)simic(at)gmail(dot)com> |
---|---|
To: | pgsql <pgsql-general(at)postgresql(dot)org> |
Subject: | PostgreSQL and VIEWS |
Date: | 2013-03-24 02:25:32 |
Message-ID: | CAH3i69=L2=LXE4YTxwJg3brC_Pv703+Mz2vH504qxcHWF3t5aw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
HI,
When I have met PostgreSQL for a first time - I have been really amazed -
with many things...
But how we started to use it - and data jumps in - we meet performance
problems...
Now, it is a bit tricky... any concrete performance problem - can be solved
on some way...
However, I am more concerned, that "our way" how we do the things - is not
best way for Postgres...
The thing is - our tables are highly normalised - so lot of joins... (NULLS
do not exist in the tables)
to make things simplier we use VIEWS to "denormalise" data - though it is
not that much we care about denormalisation - it is more to make things
simplier and less error prone...
So every "thing" (Customer, product, transaction, employee, whatever) is
built up from more tables... I am not sure we have even one "thing" built
up just from 1 table...
Lot of "thing properties" are actually calculations: i.e. Invoice Amount,
or for Employee:
We have in the one table: first_name and last_name fields, but Full name is
concatented as:
last_name || ',' || first_name
So, whenever we need employee full name - is it on Employee Info window -
or in an Invoice as Salesperson, or in Payslip report...instead of to
everywhere have above formula - we have function...
but again, instead of to developer think about each possible function for
concrete thing - we have made for each entity - the view... what have all
relevant joins - and all relevant calculated properties... about the
thing...
I have had a thought - somewhere read (but now I am more convenient I have
read it wrong) that planner is that smart that if we we have:
CREATE VIEW person_view AS
SELECT person_id, first_name, last_name, fn_full_name(id) as full_name,
date_of_birth, age(date_of_birth) from person LEFT JOIN
person_date_of_birth USING (person_id)
SELECT first_name FROM person_view WHERE id = 1
Planner is smart and:
-will not care about joins - you ask for the field(s) - what do not belong
to other tables - both belong to 1 - and it is pk!
-will not care about functions - you havent asked for any field what is
function in your query...
However - how we met more and more problems with our approach... and
spending more time on
EXPLAIN ANALYZE - instead of on business problems... It seems things are
not that good...
for simple questions as above - results are acceptable - even looking into
EXPLAIN ANALYZE i would not say it is the best possible plan... (i.e.
planner spending time on Seq Scan on person_date_of_birth_table - and
filter it - even no need to think about that table at all - LEFT JOIN (show
me columns - if there is a matching row for pk column) - so could be check
via index -however - there is no any column from that table in the query -
I would simple discard that table from plan....
So query
SELECT id FROM view WHERE id = 5 (view is SELECT * FROM table1 LEFT JOIN
table2)
I would treat the same as:
SELECT id FROM table1 = 5
ok in INNER JOIN it requires additional confimration - but even there is FK
to PK join - that confirmation is not needed iether - but in our cases it
is always FK to PK...
However - if we need to involve more "entities"/views - from some unknown
reason to me - postgres always picks bad plan...
i.e. list of employees what work in New York
we have employees_contract table:
contract_id, person_id, department_id,
a lot of others tables, but to make it shorter:
Department_view
Buidlings_view
and now query:
SELECT full_name FROM person_view INNER JOIN emplyee_contract USING
(person_id) INNER JOIN department_view USING (department_id) INNER JOIN
buildings_view USING (building_id) WHERE city_id = 'NY'
from some unknown reason - gives bad plan - then if we "refactor" query and
send different question - we get good result... I am pretty sure planner
should be capable to "rephrase" my question instead of me...
I would like to hear your expirience with VIEWS in postgres... And some
kind of best practice/advice for described situation... So far it looks to
me there is no way - to make things ready for any specific question - every
"request" will need specific SQL syntax to drive planner in acceptable
direction...
Many thanks,
Misa...
From | Date | Subject | |
---|---|---|---|
Next Message | Serge Fonville | 2013-03-24 02:37:27 | Re: PostgreSQL and VIEWS |
Previous Message | Chris Angelico | 2013-03-24 00:42:28 | Re: Rewritten rows on unchanged values |