Re: Need help in database design

From: Ron Johnson <ronljohnsonjr(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Need help in database design
Date: 2024-12-23 18:27:25
Message-ID: CANzqJaDb901c=fbicfuzXu1kvf1OR+6rtRZvtHOPyGaOD2E99Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

1. I bet you'd get better performance using RANGE partitioning.
2. Twenty million rows per userid is a *LOT*. No subdivisions (like date
range)?

On Mon, Dec 23, 2024 at 1:23 PM Divyansh Gupta JNsThMAudy <
ag1567827(at)gmail(dot)com> wrote:

> Adrian, Please check this out;
>
> PARTITION BY HASH (userid); CREATE TABLE dbo.googledocs_tbl_clone_part_0
> PARTITION OF dbo.googledocs_tbl_clone FOR VALUES WITH (modulus 84,
> remainder 0); ... CREATE TABLE dbo.googledocs_tbl_clone_part_83 PARTITION
> OF dbo.googledocs_tbl_clone FOR VALUES WITH (modulus 84, remainder 83);
>
> On Mon, Dec 23, 2024 at 11:48 PM Divyansh Gupta JNsThMAudy <
> ag1567827(at)gmail(dot)com> wrote:
>
>> Adrian, the partition is on userid using hash partition with 84 partitions
>>
>> Ron, there could be more than 20 Million records possible for a single
>> userid in that case if I create index on userid only not on other column
>> the query is taking more than 30 seconds to return the results.
>>
>> On Mon, 23 Dec 2024, 11:40 pm Ron Johnson, <ronljohnsonjr(at)gmail(dot)com>
>> wrote:
>>
>>> If your queries all reference userid, then you only need indices on gdid
>>> and userid.
>>>
>>> On Mon, Dec 23, 2024 at 12:49 PM Divyansh Gupta JNsThMAudy <
>>> ag1567827(at)gmail(dot)com> wrote:
>>>
>>>> I have one confusion with this design if I opt to create 50 columns I
>>>> need to create 50 index which will work with userid index in Bitmap on the
>>>> other hand if I create a JSONB column I need to create a single index ?
>>>>
>>>> On Mon, 23 Dec 2024, 11:10 pm Ron Johnson, <ronljohnsonjr(at)gmail(dot)com>
>>>> wrote:
>>>>
>>>>> Given what you just wrote, I'd stick with 50 separate t* columns.
>>>>> Simplifies queries, simplifies updates, and eliminates JSONB conversions.
>>>>>
>>>>> On Mon, Dec 23, 2024 at 12:29 PM Divyansh Gupta JNsThMAudy <
>>>>> ag1567827(at)gmail(dot)com> wrote:
>>>>>
>>>>>> Values can be updated based on customer actions
>>>>>>
>>>>>> All rows won't have all 50 key value pairs always if I make those
>>>>>> keys into columns the rows might have null value on the other hand if it is
>>>>>> JSONB then the key value pair will not be there
>>>>>>
>>>>>> Yes in UI customers can search for the key value pairs
>>>>>>
>>>>>> During data population the key value pair will be empty array in case
>>>>>> of JSONB column or NULL in case of table columns, later when customer
>>>>>> performs some actions that time the key value pairs will populate and
>>>>>> update, based on what action customer performs.
>>>>>>
>>>>>> On Mon, 23 Dec 2024, 10:51 pm Divyansh Gupta JNsThMAudy, <
>>>>>> ag1567827(at)gmail(dot)com> wrote:
>>>>>>
>>>>>>> Let's make it more understandable, here is the table schema with 50
>>>>>>> columns in it
>>>>>>>
>>>>>>> CREATE TABLE dbo.googledocs_tbl (
>>>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE
>>>>>>> 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL,
>>>>>>> userid int8 NOT NULL,
>>>>>>> t1 int4 NULL,
>>>>>>> t2 int4 NULL,
>>>>>>> t3 int4 NULL,
>>>>>>> t4 int4 NULL,
>>>>>>> t5 int4 NULL,
>>>>>>> t6 int4 NULL,
>>>>>>> t7 int4 NULL,
>>>>>>> t8 int4 NULL,
>>>>>>> t9 int4 NULL,
>>>>>>> t10 int4 NULL,
>>>>>>> t11 int4 NULL,
>>>>>>> t12 int4 NULL,
>>>>>>> t13 int4 NULL,
>>>>>>> t14 int4 NULL,
>>>>>>> t15 int4 NULL,
>>>>>>> t16 int4 NULL,
>>>>>>> t17 int4 NULL,
>>>>>>> t18 int4 NULL,
>>>>>>> t19 int4 NULL,
>>>>>>> t20 int4 NULL,
>>>>>>> t21 int4 NULL,
>>>>>>> t22 int4 NULL,
>>>>>>> t23 int4 NULL,
>>>>>>> t24 int4 NULL,
>>>>>>> t25 int4 NULL,
>>>>>>> t26 int4 NULL,
>>>>>>> t27 int4 NULL,
>>>>>>> t28 int4 NULL,
>>>>>>> t29 int4 NULL,
>>>>>>> t30 int4 NULL,
>>>>>>> t31 int4 NULL,
>>>>>>> t32 int4 NULL,
>>>>>>> t33 int4 NULL,
>>>>>>> t34 int4 NULL,
>>>>>>> t35 int4 NULL,
>>>>>>> t36 int4 NULL,
>>>>>>> t37 int4 NULL,
>>>>>>> t38 int4 NULL,
>>>>>>> t39 int4 NULL,
>>>>>>> t40 int4 NULL,
>>>>>>> t41 int4 NULL,
>>>>>>> t42 int4 NULL,
>>>>>>> t43 int4 NULL,
>>>>>>> t44 int4 NULL,
>>>>>>> t45 int4 NULL,
>>>>>>> t46 int4 NULL,
>>>>>>> t47 int4 NULL,
>>>>>>> t48 int4 NULL,
>>>>>>> t49 int4 NULL,
>>>>>>> t50 int4 NULL,
>>>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid),
>>>>>>> );
>>>>>>>
>>>>>>> Every time when i query I will query it along with userid
>>>>>>> Ex : where userid = 12345678 and t1 in (1,2,3) and t2 in (0,1,2)
>>>>>>> more key filters if customer applies
>>>>>>>
>>>>>>> On the other hand if I create a single jsonb column the schema will
>>>>>>> look like :
>>>>>>>
>>>>>>> CREATE TABLE dbo.googledocs_tbl (
>>>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE
>>>>>>> 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL,
>>>>>>> userid int8 NOT NULL,
>>>>>>> addons_json jsonb default '{}'::jsonb
>>>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid),
>>>>>>> );
>>>>>>>
>>>>>>> and the query would be like
>>>>>>> where userid = 12345678 and ((addons_json @> {t1:1}) or
>>>>>>> (addons_json @> {t1:2}) or (addons_json @> {t1:3})
>>>>>>> more key filters if customer applies
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Dec 23, 2024 at 10:38 PM David G. Johnston <
>>>>>>> david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Dec 23, 2024, 10:01 Divyansh Gupta JNsThMAudy <
>>>>>>>> ag1567827(at)gmail(dot)com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> So here my question is considering one JSONB column is perfect or
>>>>>>>>> considering 50 columns will be more optimised.
>>>>>>>>>
>>>>>>>> The relational database engine is designed around the column-based
>>>>>>>> approach. Especially if the columns are generally unchanging, combined
>>>>>>>> with using fixed-width data types.
>>>>>>>>
>>>>>>>> David J.
>>>>>>>>
>>>>>>>>
>>>>>
>>>>> --
>>>>> Death to <Redacted>, and butter sauce.
>>>>> Don't boil me, I'm still alive.
>>>>> <Redacted> lobster!
>>>>>
>>>>
>>>
>>> --
>>> Death to <Redacted>, and butter sauce.
>>> Don't boil me, I'm still alive.
>>> <Redacted> lobster!
>>>
>>

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ron Johnson 2024-12-23 18:29:51 Re: Need help in database design
Previous Message Divyansh Gupta JNsThMAudy 2024-12-23 18:26:28 Re: Need help in database design