From:       PFC <lists(at)peufeu(dot)com>
To:         valgog <valgog(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:    Re: Key/Value reference table generation: INSERT/UPDATE performance
Date:       2007-05-22 10:14:48
Message-ID: op.tsp32yfscigqcu@apollo13
Lists:      pgsql-performance
On Tue, 22 May 2007 10:23:03 +0200, valgog <valgog(at)gmail(dot)com> wrote:
> I found several post about INSERT/UPDATE performance in this group,
> but actually it was not really what I am searching an answer for...
>
> I have a simple reference table WORD_COUNTS that contains the count of
> words that appear in a word array storage in another table.
Mmm.
If I were you, I would:
- Create a procedure that flattens all the arrays and returns all the words:
    CREATE FUNCTION flatten_arrays() RETURNS SETOF TEXT AS $$
    DECLARE arr TEXT[]; BEGIN
        FOR arr IN SELECT word_array FROM your_table LOOP
            FOR i IN 1 .. array_upper(arr, 1) LOOP
                RETURN NEXT lower(arr[i]);  -- lowercase each word as we emit it
            END LOOP;
        END LOOP;
    END; $$ LANGUAGE plpgsql;
So, SELECT * FROM flatten_arrays() returns all the words in all the arrays.
To get the counts quickly, I'd do this:
SELECT word, count(*) FROM flatten_arrays() AS word GROUP BY word
You can then populate your counts table very easily and quickly, since
it's just a seq scan and a hash aggregate. One second for 10,000 rows would
be slow.
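For example, here's a minimal sketch of that last step, assuming your counts
table is WORD_COUNTS with columns word and count (the column names are my
assumption, adjust to your actual schema):

    -- rebuild the counts table in one pass (column names are assumptions)
    TRUNCATE word_counts;
    INSERT INTO word_counts (word, count)
        SELECT word, count(*) FROM flatten_arrays() AS word GROUP BY word;

It's just the query above wrapped in an INSERT, so it still runs as a single
seq scan plus hash aggregate.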