Re: Bitmap and-ing between btree and gin?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Jordi <jmaillists(at)promani(dot)be>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Bitmap and-ing between btree and gin?
Date: 2016-02-08 18:13:46
Message-ID: CAMkU=1wvo6OQyskSzDG8K1Y3AVe_mUhy-QgK7CgbEZLSHSSibQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Feb 4, 2016 at 9:19 AM, Jordi <jmaillists(at)promani(dot)be> wrote:

The custom here is to respond in line, not to top-post. Thanks.

>
> So basically you're saying it's hard to do sorting in any way when a gin
> index is involved? Neither with a complete multi-column btree_gin index
> because it doesn't support sorting per definition, nor with a seperate gin
> and btree because there would be an extra post-sorting step involved over
> the FULL resultset (because of the LIMIT).

In principle there is no reason (that I can think of) that a normal
btree index range scan couldn't accept a bitmap as an optional input,
and then use that as a filter which would allow it to walk the index
in order while throwing out tuples that can't match the other
conditions. You are not the first person who would benefit from such
a feature. But it would certainly not be trivial to implement. It is
not on anyone's to-do list as far as I know.

From your earlier email:

> BUT: when I remove the ORDER BY statement, the query runs really fast. It uses the 2 indexes seperately and bitmap-ands them together, resulting in a fast executing query.

When you removed the ORDER BY, did you also remove the LIMIT? If you
removed the ORDER BY and kept the LIMIT, that is pretty much a
meaningless comparison. You are asking a much easier question at that
point.

> Then would you have any hint on how to implement pagination when doing full
> text search?
> Cause in theory, if I gave it a id>100 LIMIT 100, it might just as well
> return me results 150 to 250, instead of 100 to 200...

Can you use a method that maintains state (cursor with fetching, or
temporary storage) so that it doesn't have to recalculate the query
for each page?

Cheers,

Jeff

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2016-02-08 19:16:00 Re: gin performance issue.
Previous Message Merlin Moncure 2016-02-08 14:35:58 Re: bad COPY performance with NOTIFY in a trigger