From: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
To: Thomas Kellerer <spam_eater(at)gmx(dot)net>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Disabling nested loops - worst case performance
Date: 2011-03-18 12:19:08
Message-ID: 4D834DBC.3090009@thl.fi
Lists: pgsql-performance
On 03/18/2011 01:14 PM, Thomas Kellerer wrote:
> Did you consider using hstore instead?
>
> I think in the PostgreSQL world, this is a better alternative than EAV and most probably faster as well.
No, we did not. The reason is that we want to track each attribute with
bi-temporal timestamps. The actual database schema for the attribute
value table is:
CREATE TABLE attr_value (
    id SERIAL PRIMARY KEY,
    olio_id INTEGER NOT NULL REFERENCES base_olio,            -- entity identifier
    attr_tunniste VARCHAR(20) NOT NULL REFERENCES base_attr,  -- attr identifier
    kieli_tunniste VARCHAR(20) REFERENCES kieli,               -- lang identifier
    arvo_number DECIMAL(18, 9),                                -- value number
    arvo_ts TIMESTAMPTZ,                                       -- value timestamp
    arvo_text TEXT,                                            -- value text
    arvo_valinta_tunniste VARCHAR(20),     -- for choice lists: "value_choice_identifier"
    real_valid_from TIMESTAMPTZ NOT NULL,  -- real_valid_from - real_valid_until define
    real_valid_until TIMESTAMPTZ NOT NULL, --   when things have been in the "real" world
    db_valid_from TIMESTAMPTZ NOT NULL,    -- db_valid_* define when things have been
    db_valid_until TIMESTAMPTZ NOT NULL,   --   in the database
    tx_id_insert INTEGER DEFAULT txid_current(),
    tx_id_delete INTEGER
    -- foreign keys & checks skipped
);
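To show how the bi-temporal columns are meant to be used (just the general
shape, not our actual view/procedure code; the entity id and attribute name
below are made up), reading the currently valid, currently recorded value of
one attribute looks roughly like this:
-- Illustrative only: current value of one attribute for one entity.
SELECT av.arvo_text, av.arvo_number, av.arvo_ts
  FROM attr_value av
 WHERE av.olio_id = 123                   -- some entity
   AND av.attr_tunniste = 'weight'        -- some attribute
   AND av.real_valid_from <= now()        -- valid in the "real" world right now
   AND now() < av.real_valid_until
   AND av.db_valid_from <= now()          -- and currently recorded in the database
   AND now() < av.db_valid_until;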
Naturally, we have other tables defining the objects, joins between
objects and metadata for the EAV. All data modifications are done
through procedures, which ensure uniqueness etc. for the attributes and
joins.
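To give an idea of what those procedures do (a much simplified sketch, not the
real code; the function name and parameters are made up), setting an attribute
value basically closes the old row on the database timeline and inserts a new
open-ended one:
-- Simplified sketch only; the real procedures also handle the real-world
-- timeline, languages, choice lists and the uniqueness checks.
CREATE OR REPLACE FUNCTION set_attr_value(p_olio_id integer,
                                          p_attr varchar,
                                          p_text text)
RETURNS void AS $$
BEGIN
    -- close the currently recorded row on the db timeline
    UPDATE attr_value
       SET db_valid_until = now(),
           tx_id_delete = txid_current()
     WHERE olio_id = p_olio_id
       AND attr_tunniste = p_attr
       AND now() < db_valid_until;

    -- insert the new version, open-ended on both timelines
    INSERT INTO attr_value (olio_id, attr_tunniste, arvo_text,
                            real_valid_from, real_valid_until,
                            db_valid_from, db_valid_until)
    VALUES (p_olio_id, p_attr, p_text,
            now(), 'infinity', now(), 'infinity');
END;
$$ LANGUAGE plpgsql;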
The data set is small, and performance in general is not that important,
as long as the UI is responsive and data can be transferred to other
systems in a reasonable time. Insert performance is at least 10x worse
than with a traditional schema, but that doesn't matter (we have
somewhere around 1000 inserts / updates a day, max). The only real issue
so far is the chained nested loop problem, which really kills
performance for some queries.
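A query of the kind that can trigger this looks for example like the
following (illustrative only, attribute names made up; the real queries also
carry the validity predicates):
-- Simplified shape of the problem queries: one attr_value join per attribute.
SELECT o.id,
       a1.arvo_text   AS name,
       a2.arvo_number AS weight
  FROM base_olio o
  LEFT JOIN attr_value a1 ON a1.olio_id = o.id AND a1.attr_tunniste = 'name'
  LEFT JOIN attr_value a2 ON a2.olio_id = o.id AND a2.attr_tunniste = 'weight';
-- The planner tends to chain nested loops for these joins, which is what
-- kills the queries mentioned above.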
Surprisingly (at least to me) this schema has worked really well,
although sometimes there is a feeling that we are implementing a
database using a database...
- Anssi