From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Rob Imig <rimig88(at)gmail(dot)com> |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Performant queries on table with many boolean columns |
Date: | 2016-04-21 16:36:37 |
Message-ID: | CAMkU=1yiaPZAuePwbD_=GS-JZ7r5hzpc9FhZbMX2DEBqTU=BRw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Wed, Apr 20, 2016 at 11:41 AM, Rob Imig <rimig88(at)gmail(dot)com> wrote:
> Hey all,
>
> New to the lists so please let me know if this isn't the right place for
> this question.
>
> I am trying to understand how to structure a table to allow for optimal
> performance on retrieval. The data will not change frequently so you can
> basically think of it as static and only concerned about optimizing reads
> from basic SELECT...WHERE queries.
>
> The data:
>
> ~20 million records
> Each record has 1 id and ~100 boolean properties
> Each boolean property has ~85% of the records as true
>
>
> The retrieval will always be something like "SELECT id FROM <table> WHERE
> <conditions>.
>
> <conditions> will be some arbitrary set of the ~100 boolean columns and you
> want the ids that match all of the conditions (true for each boolean
> column). Example:
> WHERE prop1 AND prop18 AND prop24
Is 3 a typical number of conditions to have?
85%^3 is 61.4%, so you are fetching most of the table. At that point,
I think I would give up on indexes and just expect to do a full table
scan each time. Which means a single column
bit-string data type might be the way to go, although the construction
of the queries would then be more cumbersome, especially if you will
do by hand.
I think the only way to know for sure is to write a few scripts to benchmark it.
Cheers,
Jeff
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2016-04-21 16:45:56 | Re: Performant queries on table with many boolean columns |
Previous Message | David G. Johnston | 2016-04-21 16:12:49 | Re: Performant queries on table with many boolean columns |