Re: My honours project - databases using dynamically attached entity-properties

From: Richard Huxton <dev(at)archonet(dot)com>
To: Sean Utt <sean(at)strateja(dot)com>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, Andrew Hammond <andrew(dot)george(dot)hammond(at)gmail(dot)com>, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, Edward(dot)Stanley(at)mcs(dot)vuw(dot)ac(dot)nz
Subject: Re: My honours project - databases using dynamically attached entity-properties
Date: 2007-03-13 09:31:45
Message-ID: 45F66F81.7090201@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sean Utt wrote:
> And then what? Make the search box on www.postgresql.org able to handle
> an email address as search text without throwing a shoe?
>
> Search for linus(at)torvalds(dot)com or any other 'email' address from the
> postgres home page. Barfage every time.
> Easy for some isn't easy for all, apparently. Left that out as a test
> case did we? Someone searching a mailing list for an email address? Who
> wudda thunk it? It works without the . -- don't know why, but then I
> also don't know why someone hasn't tried that before me.

Had a bad day?

And I'll be damned if you're not right - it doesn't return results for
dev(at)archonet(dot)com but does for dev(at)archonet(dot) Presumably something to do
with (not ?) splitting the email address on ".". Can't believe no-one
has noticed this before (me, for example). I guess that even though I
search a lot, it's not on email addrs.

Have you reported it to the www team?

> "Sure, sounds like a simple solution to me..." Richard said
> sarcastically. Would be funnier if the search on the website wasn't
> broken in a completely stupid, almost ironic way. Ah, irony and sarcasm
> -- the ugly twins.

Actually, it was Greg who said that. And it was *me* the (really very
gentle) sarcasm was directed at.

> Yeah, we have to dynamically generate queries day in and day out. But
> then some of us actually work for a living.

Umm - like all of us?

> Since we already have to do that, maybe someone could make that easier?

Good idea!

> Isn't that really the point here?

Not as I was reading the discussion.

> Someone asked if something would be
> useful, and the people who use the database to do real work said YES,
> and here's how I might use it. Like full text seach and recursive
> queries, user defined (fields|attributes|properties) and the ability to
> manage them would be BUTTER! Is it a difficult problem? YES, but if it
> wasn't, why should it be worth an advanced degree?

I think the main discussion has been around:
1. Whether Edward's final-year project is basically EAV (in which case
he'll probably need to work hard to get good marks).
2. Whether dynamically generating DDL is safe/practical in a business
setting. This seems to split some *very* experienced people about 50:50.
Half of them agree with me and the other half are inexplicably wrong ;-)

If you read section 0.3 of Edward's project proposal he argues that
dynamic DDL causes problems for the application because: "However, SQL
does not provide an easy syntax for querying these properties." (meaning
the changed structure of the database). I'm not saying this part is
easy, but I'm not convinced it's harder than doing it the "wrong" way.
At least not if you do as Edward does and enforce types.

Now, in 0.3.1 he describes a normalised webpage=>tags table pair and
shows some cumbersome-looking SQL. However, I'd argue this is due to the
verbose nature of the SQL rather than the underlying expressions.

He then looks at what I think of as the "system settings table"*
problem, where you have a bunch of configuration-settings you'd tend to
store in a single table (setting_name, value), except you'd like to have
different types for each setting (a further wrinkle is that you might
like lists of settings - do you use arrays or a sub-table?). This is
your classic EAV scenario.

Now, he proposes a set of tables - one per property, so you can enforce
type constraints, but he will need to create syntax to make this easier
to query. Presumably it'll need an automatically-generated view over the
top. (Edward - tip: read up on Date's thoughts on automatically
determining what table you can insert a row in based on its type).

This certainly looks like a variant on EAV to me, and I'm not convinced
that it's gaining much since you'll have possibly huge numbers of joins
going on in the background while not having any semantics to your table
definitions.

The other variant (which I've used myself) is to have a "type" column
and a trigger to enforce the text-value is valid for "type".

Now, if you do it "properly", that is define tables as you would as a
human designer, then you do have the problem of informing your
application on the relevant structure. Now, the syntax issues of this
are largely solved - plenty of ActiveRecord-style classes out there for
Ruby,Python,Perl,PHP,...

The interesting question then becomes where the practical limits in such
flexibility are. Simple attribute/value pairs are easy enough. How do
you feel about changes to existing tables though? How about new tables
that add relationships between existing tables? At what point does the
application just become pgAdmin?

* Another good example is the "questionnaire". Users need to be able to
define their own lists of questions and if "driving_licence=no" then
don't bother to ask "type of car=?". Edward - if you want to see a
schema that implements this sort of thing, contact me off list and I'll
see what I can do. The client for that project will probably be fine
with sharing it with one student.

--
Richard Huxton
Archonet Ltd

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2007-03-13 09:37:20 Re: Bug: Buffer cache is not scan resistant
Previous Message Csaba Nagy 2007-03-13 09:27:36 Re: My honours project - databases using dynamically attached entity-properties