Re: Unique values across a table of arrays - documents and tags

From: François Beausoleil <francois(at)teksol(dot)info>
To: Ivan Voras <ivoras(at)freebsd(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Unique values across a table of arrays - documents and tags
Date: 2012-11-07 15:34:21
Message-ID: 370680F7-B322-4241-8C23-8EA528EFE159@teksol.info
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Le 2012-11-07 à 10:21, Ivan Voras a écrit :

>
> This is unfortunately slow (because I know the load will increase and
> this will be a common operation).
>
> The thing I was planning to do is create a separate table, with only the
> unique tags, and possibly an array of documents which have these tags,
> which will be maintained with UPDATE and INSERT triggers on the
> documents table, but then I remembered that the GIN index itself does
> something not unlike this method. Is there a way to make use of this
> information to get a list of unique tags?
>
> Barring that, what would you suggest for efficiently handing a classic
> structure like this (meaning documents with tags)?
>

Can you structure it as the "classic" many to many pattern:

documents <-> taggings <-> tags

Unique tags then becomes a plain seq scan on a smallish table (tags). To keep the ability to have a single field, you can hide the documents table behind a view that would do an array_agg, such as:

SELECT documents.*, array_agg(taggings.tag)
FROM documents JOIN tags ON tags.document_id = documents.id
GROUP BY documents.*

Not sure we can do GROUP BY documents.*, but if not, you list your columns individually.

Hope that helps!
François

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Ivan Voras 2012-11-07 15:38:43 Re: Unique values across a table of arrays - documents and tags
Previous Message aasat 2012-11-07 15:23:32 Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2