From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | WIP: store additional info in GIN index |
Date: | 2012-11-18 21:54:53 |
Message-ID: | CAPpHfdtSt47PpRQBK6OawHePLJk8PF-wNhswaUpre7_+cc_kmA@mail.gmail.com |
Lists: | pgsql-hackers |
Hackers,
The attached patch enables GIN to store additional information with item
pointers in posting lists and posting trees.
Such additional information could be positions of words, positions of
trigrams, lengths of arrays and so on.
This is the first and largest patch in the series of GIN improvements that
was presented at PGConf.EU:
http://wiki.postgresql.org/images/2/25/Full-text_search_in_PostgreSQL_in_milliseconds-extended-version.pdf
The patch modifies the GIN interface as follows:
1) Two arguments are added to extractValue:
Datum **addInfo, bool **addInfoIsNull
2) Two arguments are added to consistent:
Datum addInfo[], bool addInfoIsNull[]
3) A new method, config, is introduced. It returns the datatype OID of the
additional information (by analogy with the SP-GiST config method).
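To illustrate item 1, here is a minimal sketch of what an extractValue-style function with the two new output arguments might look like for the "lengths of arrays" case. It is not code from the patch: `Datum` is a stand-in typedef so the example compiles without PostgreSQL headers, the function name `sketch_extract_value` is hypothetical, and the argument list is simplified relative to the real extractValue (the input is a plain C array and the key NULL flags are left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for PostgreSQL's Datum; real code would include postgres.h. */
typedef uintptr_t Datum;

/*
 * Hypothetical extractValue for an integer array (simplified signature).
 * Every element becomes a key, and the array length is attached to every
 * key as its additional information, which is never NULL here.
 * Returns NULL and sets *nkeys to 0 for an empty array or on allocation
 * failure.
 */
static Datum *
sketch_extract_value(const int32_t *elems, int32_t nelems, int32_t *nkeys,
                     Datum **addInfo, bool **addInfoIsNull)
{
    Datum   *keys;
    int32_t  i;

    *nkeys = 0;
    *addInfo = NULL;
    *addInfoIsNull = NULL;
    if (nelems <= 0)
        return NULL;

    keys = malloc(sizeof(Datum) * (size_t) nelems);
    *addInfo = malloc(sizeof(Datum) * (size_t) nelems);
    *addInfoIsNull = malloc(sizeof(bool) * (size_t) nelems);
    if (keys == NULL || *addInfo == NULL || *addInfoIsNull == NULL)
    {
        free(keys);
        free(*addInfo);
        free(*addInfoIsNull);
        *addInfo = NULL;
        *addInfoIsNull = NULL;
        return NULL;
    }

    for (i = 0; i < nelems; i++)
    {
        keys[i] = (Datum) (uint32_t) elems[i];  /* the key itself */
        (*addInfo)[i] = (Datum) nelems;         /* array length as addInfo */
        (*addInfoIsNull)[i] = false;            /* length is always known */
    }
    *nkeys = nelems;
    return keys;
}
```

With this shape, consistent would receive the same per-key values through its addInfo[] and addInfoIsNull[] arguments, so it could take the stored array length into account without fetching the heap tuple.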
The patch completely changes the storage format of posting lists and of leaf
pages of posting trees. It uses varbyte encoding for BlockNumber and
OffsetNumber. BlockNumbers are stored incrementally within a page.
Additionally, one bit of OffsetNumber is reserved for the NULL flag of the
additional information. To make it possible to find a position in a leaf data
page quickly, the patch introduces a small index at the end of the page.
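The storage scheme described above can be sketched roughly as follows. This is an illustration, not the patch's code: it assumes that "incremental" means each BlockNumber is stored as a difference from the previous one, that the NULL flag occupies the lowest bit of the encoded OffsetNumber, and that varbyte means 7 data bits per byte with the high bit as a continuation marker. The struct and function names are hypothetical, and the small index at the end of the page is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an item pointer plus its addInfo NULL flag. */
typedef struct
{
    uint32_t block;      /* heap block number */
    uint16_t offset;     /* offset number within the block */
    int      info_null;  /* 1 if the additional information is NULL */
} SketchItem;

/* Write v in 7-bit groups, lowest group first; high bit means "more follows". */
static size_t
varbyte_put(uint8_t *dst, uint32_t v)
{
    size_t n = 0;

    while (v >= 0x80)
    {
        dst[n++] = (uint8_t) (v | 0x80);
        v >>= 7;
    }
    dst[n++] = (uint8_t) v;
    return n;
}

/* Read one varbyte value; returns the number of bytes consumed. */
static size_t
varbyte_get(const uint8_t *src, uint32_t *v)
{
    size_t   n = 0;
    int      shift = 0;
    uint32_t out = 0;
    uint8_t  b;

    do
    {
        b = src[n++];
        out |= (uint32_t) (b & 0x7F) << shift;
        shift += 7;
    } while (b & 0x80);
    *v = out;
    return n;
}

/* Encode one item relative to the previous item's block number. */
static size_t
item_encode(uint8_t *dst, uint32_t prev_block, SketchItem it)
{
    size_t n;

    assert(it.block >= prev_block);  /* items must be sorted by block */
    n = varbyte_put(dst, it.block - prev_block);
    /* Assumed layout: offset shifted left by one, NULL flag in bit 0. */
    n += varbyte_put(dst + n,
                     ((uint32_t) it.offset << 1) | (it.info_null ? 1u : 0u));
    return n;
}

/* Decode one item; the caller supplies the previous item's block number. */
static size_t
item_decode(const uint8_t *src, uint32_t prev_block, SketchItem *it)
{
    uint32_t delta;
    uint32_t off;
    size_t   n = varbyte_get(src, &delta);

    n += varbyte_get(src + n, &off);
    it->block = prev_block + delta;
    it->offset = (uint16_t) (off >> 1);
    it->info_null = (int) (off & 1u);
    return n;
}
```

Under these assumptions, the items (block 1000, offset 5) and (block 1003, offset 70, NULL addInfo) take 3 bytes each, compared with 6 bytes for each uncompressed item pointer. Because every item depends on the block number before it, decoding has to start from a known position, which is presumably where the small index at the end of the page comes in.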
------
With best regards,
Alexander Korotkov.
Attachment | Content-Type | Size |
---|---|---|
ginaddinfo.1.patch.gz | application/x-gzip | 31.3 KB |