From: | Rj Ewing <ewing(dot)rj(at)gmail(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: full text search on hstore or json with materialized view? |
Date: | 2017-04-20 15:15:54 |
Message-ID: | CAOSSsV2mtCnbhUm4DGn8-1aAvPJTShBg9ZMpAiaD1F-Jnx0Mzg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Wed, Apr 19, 2017 at 8:09 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>
> Your best bet might be to ignore the per-field searching in the initial
> (indexed) pass of the query to get everything that has all the search
> terms, regardless of which field they occur in. And the re-check whether
> each of the found values was found in the appropriate field in a later pass.
>
> Something like
>
> select * from sample where
> to_tsvector(json_thing->>:key1) @@ :value1
> and to_tsvector(json_thing->>:key2) @@ :value2
> and to_tsvector('english',json_thing) @@ (:value1 || :value2)
>
that worked pretty well when there was an AND condition with multiple k:v
pairs as you have. However replacing it with an OR condition across k:v
pairs it was pretty slow. I do like the simplicity though. Maybe indexing
the 10ish most common columns would be a "good enough" solution.
>
> From the initial email:
>
> > An idea that has come up is to use a materialized view or secondary
> table with triggers, where we would have 3 columns (id, key, value).
>
> How would this be different from the "triple store" you are abandoning?
>
it would be fairly similar. One advantage would be that we could simplify
the backend to just a RDMS (which we use already), and not have to maintain
a separate "triple store" instance
From | Date | Subject | |
---|---|---|---|
Next Message | Rj Ewing | 2017-04-20 15:50:31 | Re: full text search on hstore or json with materialized view? |
Previous Message | Scott Marlowe | 2017-04-20 15:01:12 | Re: Large data and slow queries |