From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Proposal: q-gram GIN and GiST indexes |
Date: | 2011-04-05 13:39:15 |
Message-ID: | BANLkTi=vY-MG_eyyONEqPj4uy3fOY=P0qg@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Apr 5, 2011 at 5:05 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I am probably being stupid here, but doesn't the number of links to
> rows grow proportionately to the number of n-grams?
The number of links to rows grows in proportion to the total number of
extracted q-grams, not to the number of unique q-grams. If the same q-gram
occurs more than once within a single indexed value, it produces only one
link, which reduces the number of links (but that is rare).
Let's consider a simple example. Two rows contain the strings 'aaa' and
'aaab'. We extract the 3-gram 'aaa' from the first string and the 3-grams
'aaa' and 'aab' from the second string (for simplicity, there is no padding
here). The GIN index will contain a structure that can be represented like this:
'aaa' => 1, 2
'aab' => 2
We can see that there are 2 unique 3-grams, but 3 links to rows.
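To make the counting concrete, here is a minimal Python sketch of the same example. It is only an illustration and not how GIN is implemented: the names `qgrams` and `build_index` are made up for this message, and a dict of sets stands in for the mapping from each key to the rows that contain it.

```python
from collections import defaultdict


def qgrams(value, q=3):
    """Return the distinct q-grams of a string, without padding."""
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def build_index(rows, q=3):
    """Map each q-gram to the set of row ids whose value contains it.

    Hypothetical stand-in for an inverted index; a repeated q-gram inside
    one value still yields a single link, because row ids are kept in a set.
    """
    index = defaultdict(set)
    for row_id, value in rows.items():
        for gram in qgrams(value, q):
            index[gram].add(row_id)
    return index


rows = {1: "aaa", 2: "aaab"}
index = build_index(rows)

unique_qgrams = len(index)                              # number of keys
total_links = sum(len(ids) for ids in index.values())   # number of row links

for gram in sorted(index):
    print(repr(gram), "=>", sorted(index[gram]))
print("unique q-grams:", unique_qgrams, "links:", total_links)
```

Adding more rows that contain 'aaa' would increase the number of links while the number of unique q-grams stays the same.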
----
With best regards,
Alexander Korotkov.