Quick Links

Re: Tweaking Foreign Keys for larger tables

From:	David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Tweaking Foreign Keys for larger tables
Date:	2014-11-06 08:11:22
Message-ID:	1415261482441-5825891.post@n5.nabble.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Peter Eisentraut-2 wrote
> On 10/31/14 6:19 AM, Simon Riggs wrote:
>> Various ways of tweaking Foreign Keys are suggested that are helpful
>> for larger databases.
>
>> * INITIALLY NOT ENFORCED
>> FK created, but is not enforced during DML.
>> Will be/Must be marked NOT VALID when first created.
>> We can run a VALIDATE on the constraint at any time; if it passes the
>> check it is marked VALID and presumed to stay that way until the next
>> VALIDATE run.
>
> Does that mean the FK would become invalid after every DML operation,
> until you expicitly revalidate it? Is that practical?

My read is that it means that you can insert invalid data but the system
will pretend it is valid unless someone asks it for confirmation. Upon
validation the FK will become invalid until the discrepancy is fixed and
another validation is performed.

>> ON DELETE IGNORE
>> ON UPDATE IGNORE
>> If we allow this specification then the FK is "one way" - we check the
>> existence of a row in the referenced table, but there is no need for a
>> trigger on the referenced table to enforce an action on delete or
>> update, so no need to lock the referenced table when adding FKs.
>
> Are you worried about locking the table at all, or about having to lock
> many rows?

Wouldn't you at least need some kind of trigger to make the constraint
invalid as soon as any record is updated or removed from the referenced
table since in all likelihood the FK relationship has just been broken?

How expensive is validation going to be? Especially, can validation occur
incrementally or does every record need to be validated each time?

Is this useful for master-detail setups, record-category, or both (others?)?

Will optimizations over invalid data give incorrect answers and in what
specific scenarios can that be expected?

I get the idea of having a system that let's you skip constant data
validation since in all likelihood once in production some scenarios would
be extremely resistant to the introduction of errors and can be dealt with
on-the-fly. Trust only since the verify is expensive - but keep the option
open and the model faithfully represented.

I don't know that I would ever think to use this in my world since the
additional admin effort is obvious but the cost of the thing I'd be avoiding
is vague. As it is now someone could simply drop their FK constraints and
run a validation query periodically to see if the data being inserted is
correct. That doesn't allow for optimizations to take place though and so
this is an improvement; but the documentation and support aspects for a
keep/drop decision can be fleshed out first as that would be valuable in its
own right. Then go about figuring out how to make a hybrid implementation
work.

Put another way: at what point does the cost of the FK constraint outweigh
the optimization savings? While size is obvious both schema and read/write
patterns likely have a significant influence.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Tweaking-Foreign-Keys-for-larger-tables-tp5825162p5825891.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

In response to

Re: Tweaking Foreign Keys for larger tables at 2014-11-05 21:15:38 from Peter Eisentraut

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Simon Riggs	2014-11-06 08:49:56	Re: Tweaking Foreign Keys for larger tables
Previous Message	Andres Freund	2014-11-06 07:50:45	Re: tracking commit timestamps