From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Useless removal of duplicate GIN index entries in pg_trgm |
Date: | 2012-08-27 19:38:11 |
Message-ID: | 7688.1346096291@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Fujii Masao <masao(dot)fujii(at)gmail(dot)com> writes:
> After pg_trgm extracts the trigrams as GIN index keys, generate_trgm()
> removes duplicate index keys, to avoid generating redundant index entries.
> Also, ginExtractEntries(), which is the caller of pg_trgm, does the same thing.
> Why do we need to remove GIN index entries twice? I think that we can
> get rid of the removal-of-duplicate code block from generate_trgm()
> because it's useless. Comments?
I see eight different callers of generate_trgm(). It might be that
gin_extract_value_trgm() doesn't really need this behavior, but that
doesn't mean the other seven don't want it.
Also, seeing that generate_trgm() is able to use relatively cheap
trigram-specific comparison operators for this, it's not impossible
that getting rid of duplicates internal to it is a net savings even
for the gin_extract_value case, because it'd reduce the number of
much-more-heavyweight comparisons done by ginExtractEntries...
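
To make the point above concrete, here is a rough sketch of the kind of cheap, trigram-specific duplicate removal being described. It is not the pg_trgm source: the Trigram type and the cmp_trigram() and dedup_trigrams() names are hypothetical, and a plain memcmp() stands in for whatever comparison generate_trgm() actually uses. The only assumption is that a trigram fits in three bytes, so comparing two of them costs at most three byte compares.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a 3-byte trigram key; not pg_trgm's actual type. */
typedef struct
{
    unsigned char c[3];
} Trigram;

/* Cheap comparison: a bytewise compare of three bytes, nothing more. */
static int
cmp_trigram(const void *a, const void *b)
{
    return memcmp(a, b, sizeof(Trigram));
}

/* Sort in place, squeeze out adjacent duplicates, return the new length. */
static size_t
dedup_trigrams(Trigram *t, size_t n)
{
    size_t      last = 0;
    size_t      i;

    assert(t != NULL || n == 0);
    if (n < 2)
        return n;

    qsort(t, n, sizeof(Trigram), cmp_trigram);
    for (i = 1; i < n; i++)
    {
        /* Keep t[i] only if it differs from the last element kept. */
        if (cmp_trigram(&t[last], &t[i]) != 0)
            t[++last] = t[i];
    }
    return last + 1;
}
```

If the array is reduced this way before it is handed back, any later generic duplicate check has fewer items to compare. Whether that is a net win in practice would need to be measured.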
regards, tom lane