From: | Divyansh Gupta JNsThMAudy <ag1567827(at)gmail(dot)com> |
---|---|
To: | Ron Johnson <ronljohnsonjr(at)gmail(dot)com> |
Cc: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Need help in database design |
Date: | 2024-12-23 18:30:25 |
Message-ID: | CAHesJ5LbFjqN0NndYUHKsXX_JssgwM-VPPaYoi28kZvsciFifQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Range partition can help when you applies filter for a specific range but
in my case I need to apply filter on userid always, however I have date
columns but there is less variation in timestamp which I have that's why
didn't go for range partition.
On Mon, 23 Dec 2024, 11:57 pm Ron Johnson, <ronljohnsonjr(at)gmail(dot)com> wrote:
>
> 1. I bet you'd get better performance using RANGE partitioning.
> 2. Twenty million rows per userid is a *LOT*. No subdivisions (like date
> range)?
>
> On Mon, Dec 23, 2024 at 1:23 PM Divyansh Gupta JNsThMAudy <
> ag1567827(at)gmail(dot)com> wrote:
>
>> Adrian, Please check this out;
>>
>> PARTITION BY HASH (userid); CREATE TABLE dbo.googledocs_tbl_clone_part_0
>> PARTITION OF dbo.googledocs_tbl_clone FOR VALUES WITH (modulus 84,
>> remainder 0); ... CREATE TABLE dbo.googledocs_tbl_clone_part_83 PARTITION
>> OF dbo.googledocs_tbl_clone FOR VALUES WITH (modulus 84, remainder 83);
>>
>> On Mon, Dec 23, 2024 at 11:48 PM Divyansh Gupta JNsThMAudy <
>> ag1567827(at)gmail(dot)com> wrote:
>>
>>> Adrian, the partition is on userid using hash partition with 84
>>> partitions
>>>
>>> Ron, there could be more than 20 Million records possible for a single
>>> userid in that case if I create index on userid only not on other column
>>> the query is taking more than 30 seconds to return the results.
>>>
>>> On Mon, 23 Dec 2024, 11:40 pm Ron Johnson, <ronljohnsonjr(at)gmail(dot)com>
>>> wrote:
>>>
>>>> If your queries all reference userid, then you only need indices on
>>>> gdid and userid.
>>>>
>>>> On Mon, Dec 23, 2024 at 12:49 PM Divyansh Gupta JNsThMAudy <
>>>> ag1567827(at)gmail(dot)com> wrote:
>>>>
>>>>> I have one confusion with this design if I opt to create 50 columns I
>>>>> need to create 50 index which will work with userid index in Bitmap on the
>>>>> other hand if I create a JSONB column I need to create a single index ?
>>>>>
>>>>> On Mon, 23 Dec 2024, 11:10 pm Ron Johnson, <ronljohnsonjr(at)gmail(dot)com>
>>>>> wrote:
>>>>>
>>>>>> Given what you just wrote, I'd stick with 50 separate t* columns.
>>>>>> Simplifies queries, simplifies updates, and eliminates JSONB conversions.
>>>>>>
>>>>>> On Mon, Dec 23, 2024 at 12:29 PM Divyansh Gupta JNsThMAudy <
>>>>>> ag1567827(at)gmail(dot)com> wrote:
>>>>>>
>>>>>>> Values can be updated based on customer actions
>>>>>>>
>>>>>>> All rows won't have all 50 key value pairs always if I make those
>>>>>>> keys into columns the rows might have null value on the other hand if it is
>>>>>>> JSONB then the key value pair will not be there
>>>>>>>
>>>>>>> Yes in UI customers can search for the key value pairs
>>>>>>>
>>>>>>> During data population the key value pair will be empty array in
>>>>>>> case of JSONB column or NULL in case of table columns, later when customer
>>>>>>> performs some actions that time the key value pairs will populate and
>>>>>>> update, based on what action customer performs.
>>>>>>>
>>>>>>> On Mon, 23 Dec 2024, 10:51 pm Divyansh Gupta JNsThMAudy, <
>>>>>>> ag1567827(at)gmail(dot)com> wrote:
>>>>>>>
>>>>>>>> Let's make it more understandable, here is the table schema with 50
>>>>>>>> columns in it
>>>>>>>>
>>>>>>>> CREATE TABLE dbo.googledocs_tbl (
>>>>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE
>>>>>>>> 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL,
>>>>>>>> userid int8 NOT NULL,
>>>>>>>> t1 int4 NULL,
>>>>>>>> t2 int4 NULL,
>>>>>>>> t3 int4 NULL,
>>>>>>>> t4 int4 NULL,
>>>>>>>> t5 int4 NULL,
>>>>>>>> t6 int4 NULL,
>>>>>>>> t7 int4 NULL,
>>>>>>>> t8 int4 NULL,
>>>>>>>> t9 int4 NULL,
>>>>>>>> t10 int4 NULL,
>>>>>>>> t11 int4 NULL,
>>>>>>>> t12 int4 NULL,
>>>>>>>> t13 int4 NULL,
>>>>>>>> t14 int4 NULL,
>>>>>>>> t15 int4 NULL,
>>>>>>>> t16 int4 NULL,
>>>>>>>> t17 int4 NULL,
>>>>>>>> t18 int4 NULL,
>>>>>>>> t19 int4 NULL,
>>>>>>>> t20 int4 NULL,
>>>>>>>> t21 int4 NULL,
>>>>>>>> t22 int4 NULL,
>>>>>>>> t23 int4 NULL,
>>>>>>>> t24 int4 NULL,
>>>>>>>> t25 int4 NULL,
>>>>>>>> t26 int4 NULL,
>>>>>>>> t27 int4 NULL,
>>>>>>>> t28 int4 NULL,
>>>>>>>> t29 int4 NULL,
>>>>>>>> t30 int4 NULL,
>>>>>>>> t31 int4 NULL,
>>>>>>>> t32 int4 NULL,
>>>>>>>> t33 int4 NULL,
>>>>>>>> t34 int4 NULL,
>>>>>>>> t35 int4 NULL,
>>>>>>>> t36 int4 NULL,
>>>>>>>> t37 int4 NULL,
>>>>>>>> t38 int4 NULL,
>>>>>>>> t39 int4 NULL,
>>>>>>>> t40 int4 NULL,
>>>>>>>> t41 int4 NULL,
>>>>>>>> t42 int4 NULL,
>>>>>>>> t43 int4 NULL,
>>>>>>>> t44 int4 NULL,
>>>>>>>> t45 int4 NULL,
>>>>>>>> t46 int4 NULL,
>>>>>>>> t47 int4 NULL,
>>>>>>>> t48 int4 NULL,
>>>>>>>> t49 int4 NULL,
>>>>>>>> t50 int4 NULL,
>>>>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid),
>>>>>>>> );
>>>>>>>>
>>>>>>>> Every time when i query I will query it along with userid
>>>>>>>> Ex : where userid = 12345678 and t1 in (1,2,3) and t2 in (0,1,2)
>>>>>>>> more key filters if customer applies
>>>>>>>>
>>>>>>>> On the other hand if I create a single jsonb column the schema will
>>>>>>>> look like :
>>>>>>>>
>>>>>>>> CREATE TABLE dbo.googledocs_tbl (
>>>>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE
>>>>>>>> 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL,
>>>>>>>> userid int8 NOT NULL,
>>>>>>>> addons_json jsonb default '{}'::jsonb
>>>>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid),
>>>>>>>> );
>>>>>>>>
>>>>>>>> and the query would be like
>>>>>>>> where userid = 12345678 and ((addons_json @> {t1:1}) or
>>>>>>>> (addons_json @> {t1:2}) or (addons_json @> {t1:3})
>>>>>>>> more key filters if customer applies
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Dec 23, 2024 at 10:38 PM David G. Johnston <
>>>>>>>> david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Dec 23, 2024, 10:01 Divyansh Gupta JNsThMAudy <
>>>>>>>>> ag1567827(at)gmail(dot)com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> So here my question is considering one JSONB column is perfect or
>>>>>>>>>> considering 50 columns will be more optimised.
>>>>>>>>>>
>>>>>>>>> The relational database engine is designed around the column-based
>>>>>>>>> approach. Especially if the columns are generally unchanging, combined
>>>>>>>>> with using fixed-width data types.
>>>>>>>>>
>>>>>>>>> David J.
>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Death to <Redacted>, and butter sauce.
>>>>>> Don't boil me, I'm still alive.
>>>>>> <Redacted> lobster!
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Death to <Redacted>, and butter sauce.
>>>> Don't boil me, I'm still alive.
>>>> <Redacted> lobster!
>>>>
>>>
>
> --
> Death to <Redacted>, and butter sauce.
> Don't boil me, I'm still alive.
> <Redacted> lobster!
>
From | Date | Subject | |
---|---|---|---|
Next Message | Divyansh Gupta JNsThMAudy | 2024-12-23 18:31:57 | Re: Need help in database design |
Previous Message | Ron Johnson | 2024-12-23 18:29:51 | Re: Need help in database design |