Re: Using hashtext and unique constraints together

From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: "Mason Hale" <masonhale(at)gmail(dot)com>
Cc: "pgsql-general" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Using hashtext and unique constraints together
Date: 2007-12-12 11:23:18
Message-ID: b24b2c2f-14f7-45b2-895d-5760b4be8a97@mm
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Mason Hale wrote:

> The problem I need help with is guaranteeing uniqueness of the URL on
> insert, with a non-unique index on hashtext(url) and *without* creating a
> unique index on page(url).
>
> I'm thinking that an insert trigger that ensures (SELECT count(*) FROM page
> WHERE hashtext(url) = hashtext('http://www.postgresql.org') AND url = '
> http://www.postgresql.org' ) = 0 won't work given MVCC, as two transactions
> could simultaneously insert the same url at the same time.

Suppose that before that SELECT, the transaction would insert the url
into a small dedicated table, with a unique index on the url column.
If a second transaction starts with the same url value before the first one
commits, it will be blocked at the point of the insert, because of that
unique index. And then it will fail with a constraint violation if and when
the first one commits. And even if it didn't, the SELECT count(*) above
would return non-zero.
What makes this table "small" is that each row in it can be discarded as
soon as its corresponding transaction has been committed, since after that
point, the SELECT count(*) inside your trigger has enough visibility to take
care of the unicity check.
So there could be a cron job periodically deleting all rows from the small
table (that is, all committed rows). That would keep its size small, otherwise
of course you'd be eventually back to the original problem of having that
big index.

It's certainly not elegant, but I believe it would work.

--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Simon Riggs 2007-12-12 11:30:25 Re: Slow PITR restore
Previous Message Lawrence Oluyede 2007-12-12 11:11:39 Re: Possible bug in PostgreSQL 8.3beta4 (postgres process segfaults)