From: | Marco Colli <collimarco91(at)gmail(dot)com> |
---|---|
To: | Nick Cleaton <nick(at)cleaton(dot)net> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Index for range queries on JSON (user defined fields) |
Date: | 2020-12-05 10:50:09 |
Message-ID: | CAFvCgN4fZc4ZGw5z-y_xbj9sJrK9nMNgzDiUkXBDQQuunZgstQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Thanks for the suggestion: I had already considered that solution (first
link), but the fear is having to JOIN large tables with hundreds of
millions of records.
For my understanding **using JOIN when dealing with big data is bad and a
nightmare for performance**: can you confirm? Or am I missing something?
That tables would be frequently read and updated and are the core of the
application: that also means that every update on a user would produce
**many dead rows** - not just 1 user row, as in the case of JSON, but many
rows in the user metadata table.
On Fri, Dec 4, 2020 at 11:40 PM Nick Cleaton <nick(at)cleaton(dot)net> wrote:
> On Fri, 4 Dec 2020 at 15:39, Marco Colli <collimarco91(at)gmail(dot)com> wrote:
>
>> Hello!
>>
>> We have a multi-tenant service where each customer has millions of users
>> (total: ~150M rows). Now we would like to let each customer define some
>> custom columns for his users and then let the customer search his users
>> efficiently based on these columns.
>>
>> This problem seems really hard to solve with PostgreSQL:
>>
>> https://stackoverflow.com/questions/5106335/how-to-design-a-database-for-user-defined-fields
>>
>> In particular the easiest way would be to add a JSON field on the users
>> table (e.g. user metadata). However the PostgreSQL GIN index only supports
>> exact matches and not range queries. This means that a query on a range
>> (e.g. age > 30) would be extremely inefficient and would result in a table
>> scan.
>>
>
> You could have a table of (tenant, customer, setting_name, setting_value)
> so that a btree index on (tenant, setting_name, setting_value) would work
> for "select customer from my_table where tenant=$1 and setting_name='age'
> and setting_value > 30"
>
> That doesn't deal with setting values having a variety of types, but you
> could have a distinct user defined settings table for each setting value
> type that you want to support.
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Atul Kumar | 2020-12-11 08:12:40 | "Required checkpoints occurs too frequently" |
Previous Message | Nick Cleaton | 2020-12-04 22:39:45 | Re: Index for range queries on JSON (user defined fields) |