From: | Heikki Linnakangas <heikki(at)enterprisedb(dot)com> |
---|---|
To: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Cc: | Patches <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: Partial match in GIN |
Date: | 2008-04-08 19:17:00 |
Message-ID: | 47FBC4AC.8060808@enterprisedb.com |
Lists: | pgsql-patches |
Teodor Sigaev wrote:
>> Looking at the patch, you require that the TIDBitmap fits in work_mem
>> in non-lossy format. I don't think that's acceptable, it can easily
>> exceed work_mem if you search for some very common word. Failing to
>> execute a valid query is not good.
> But that way is better than nothing. In reality, that way was chosen to
> get a fast merge of (potentially) hundreds of sorted lists of
> ItemPointers. Other ways are much slower.
How about forcing the use of a bitmap index scan, and modifying the
indexam API so that GIN could return a lossy bitmap and let the bitmap
heap scan do the rechecking?
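
To make the idea concrete, here is a toy model of a bitmap that degrades
to page-level ("lossy") entries when it exceeds a memory budget, with the
heap scan rechecking every tuple on a lossy page. It is only a sketch and
not PostgreSQL's tidbitmap.c: the names LossyTidBitmap, max_exact_tids and
bitmap_heap_scan are hypothetical, the budget is counted in TIDs rather
than bytes, and lossifying the fullest page first is an arbitrary choice
made for the illustration.

```python
class LossyTidBitmap:
    """Toy TID bitmap: exact per-tuple entries until a budget is exceeded."""

    def __init__(self, max_exact_tids):
        self.max_exact_tids = max_exact_tids  # stand-in for the work_mem limit
        self.exact = {}     # block number -> set of tuple offsets
        self.lossy = set()  # block numbers remembered only at page level
        self.n_exact = 0    # number of exact TIDs currently stored

    def add(self, block, offset):
        if block in self.lossy:
            return  # the whole page is already a candidate
        offsets = self.exact.setdefault(block, set())
        if offset not in offsets:
            offsets.add(offset)
            self.n_exact += 1
        # Over budget: trade per-tuple detail for page-level entries.
        while self.n_exact > self.max_exact_tids and self.exact:
            self._lossify_fullest_page()

    def _lossify_fullest_page(self):
        block = max(self.exact, key=lambda b: len(self.exact[b]))
        self.n_exact -= len(self.exact.pop(block))
        self.lossy.add(block)


def bitmap_heap_scan(heap, bitmap, predicate):
    """Return matching TIDs; heap maps block -> {offset: row value}."""
    result = []
    for block in sorted(set(bitmap.exact) | bitmap.lossy):
        if block in bitmap.lossy:
            # Page-level entry: visit every tuple and recheck the qual.
            for offset, row in sorted(heap[block].items()):
                if predicate(row):
                    result.append((block, offset))
        else:
            # Exact entry: this toy model trusts the index result as-is.
            result.extend((block, off) for off in sorted(bitmap.exact[block]))
    return result
```

With a small budget the bitmap never fails; it just makes the heap scan do
more rechecking on the pages that went lossy.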
>> I don't think the storage size of tsquery matters much, so whatever is
>> the best solution in terms of code readability etc.
> That was about the tsquerysend/recv format, not the on-disk storage. We
> don't require compatibility of the binary format of the db's files, but
> I have some doubts about binary dumps.
We generally don't make any promises about cross-version compatibility
of binary dumps, though it would be nice not to break it if it's not too
much effort.
>> Hmm. match_special_index_operator() already checks that the index's
>> opfamily is pattern_ops, or text_ops with C-locale. Are you reusing
>> the same operator families for wildspeed? Doesn't it then also get
>> confused if you do a "WHERE textcol > 'foo'" query by hand?
> No, wildspeed uses the same operator ~~
> match_special_index_operator() isn't called at all: in
> match_clause_to_indexcol(), is_indexable_operator() is called
> before match_special_index_operator() and returns true.
>
> expand_indexqual_opclause() sees that the operator is OID_TEXT_LIKE_OP
> and calls prefix_quals(), which fails because it accepts only a few
> btree opfamilies.
Oh, I see. So this assumption mentioned in the comment there:
/*
 * LIKE and regex operators are not members of any index opfamily,
 * so if we find one in an indexqual list we can assume that it
 * was accepted by match_special_index_operator().
 */
is no longer true with wildspeed. So we do need to check that in
expand_indexqual_opclause() then.
>>> NOTICE 2: it seems to me that a similar technique could be implemented
>>> for ordinary B-tree to eliminate the hack around LIKE support.
>> LIKE expression. I wonder what the size and performance of that would
>> be like, in comparison to the proposed GIN solution?
>
> GIN speeds up '%foo%' too - which is impossible for btree. But I don't
> like the hack around LIKE support in btree. That support relies on
> workarounds that bypass the regular mechanism.
You could satisfy '%foo%' using a regular and a reverse B-tree index,
and a bitmap AND, which is interestingly similar to the way you proposed
to use a TIDBitmap within GIN.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com