From: | Divyansh Gupta JNsThMAudy <ag1567827(at)gmail(dot)com> |
---|---|
To: | Ron Johnson <ronljohnsonjr(at)gmail(dot)com> |
Cc: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Need help in database design |
Date: | 2024-12-23 18:31:57 |
Message-ID: | CAHesJ5Kn4bm6DmEVtEmHc1itXovdqNo8iE4kiDunH3gYK5jF4A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Currently I haven't created those columns , I have created addons_json
column which is a JSONB column yet in a discussion weather I should create
or consider only one JSONB column.
On Tue, 24 Dec 2024, 12:00 am Divyansh Gupta JNsThMAudy, <
ag1567827(at)gmail(dot)com> wrote:
> Range partition can help when you applies filter for a specific range but
> in my case I need to apply filter on userid always, however I have date
> columns but there is less variation in timestamp which I have that's why
> didn't go for range partition.
>
> On Mon, 23 Dec 2024, 11:57 pm Ron Johnson, <ronljohnsonjr(at)gmail(dot)com>
> wrote:
>
>>
>> 1. I bet you'd get better performance using RANGE partitioning.
>> 2. Twenty million rows per userid is a *LOT*. No subdivisions (like
>> date range)?
>>
>> On Mon, Dec 23, 2024 at 1:23 PM Divyansh Gupta JNsThMAudy <
>> ag1567827(at)gmail(dot)com> wrote:
>>
>>> Adrian, Please check this out;
>>>
>>> PARTITION BY HASH (userid); CREATE TABLE dbo.googledocs_tbl_clone_part_0
>>> PARTITION OF dbo.googledocs_tbl_clone FOR VALUES WITH (modulus 84,
>>> remainder 0); ... CREATE TABLE dbo.googledocs_tbl_clone_part_83
>>> PARTITION OF dbo.googledocs_tbl_clone FOR VALUES WITH (modulus 84,
>>> remainder 83);
>>>
>>> On Mon, Dec 23, 2024 at 11:48 PM Divyansh Gupta JNsThMAudy <
>>> ag1567827(at)gmail(dot)com> wrote:
>>>
>>>> Adrian, the partition is on userid using hash partition with 84
>>>> partitions
>>>>
>>>> Ron, there could be more than 20 Million records possible for a single
>>>> userid in that case if I create index on userid only not on other column
>>>> the query is taking more than 30 seconds to return the results.
>>>>
>>>> On Mon, 23 Dec 2024, 11:40 pm Ron Johnson, <ronljohnsonjr(at)gmail(dot)com>
>>>> wrote:
>>>>
>>>>> If your queries all reference userid, then you only need indices on
>>>>> gdid and userid.
>>>>>
>>>>> On Mon, Dec 23, 2024 at 12:49 PM Divyansh Gupta JNsThMAudy <
>>>>> ag1567827(at)gmail(dot)com> wrote:
>>>>>
>>>>>> I have one confusion with this design if I opt to create 50 columns I
>>>>>> need to create 50 index which will work with userid index in Bitmap on the
>>>>>> other hand if I create a JSONB column I need to create a single index ?
>>>>>>
>>>>>> On Mon, 23 Dec 2024, 11:10 pm Ron Johnson, <ronljohnsonjr(at)gmail(dot)com>
>>>>>> wrote:
>>>>>>
>>>>>>> Given what you just wrote, I'd stick with 50 separate t* columns.
>>>>>>> Simplifies queries, simplifies updates, and eliminates JSONB conversions.
>>>>>>>
>>>>>>> On Mon, Dec 23, 2024 at 12:29 PM Divyansh Gupta JNsThMAudy <
>>>>>>> ag1567827(at)gmail(dot)com> wrote:
>>>>>>>
>>>>>>>> Values can be updated based on customer actions
>>>>>>>>
>>>>>>>> All rows won't have all 50 key value pairs always if I make those
>>>>>>>> keys into columns the rows might have null value on the other hand if it is
>>>>>>>> JSONB then the key value pair will not be there
>>>>>>>>
>>>>>>>> Yes in UI customers can search for the key value pairs
>>>>>>>>
>>>>>>>> During data population the key value pair will be empty array in
>>>>>>>> case of JSONB column or NULL in case of table columns, later when customer
>>>>>>>> performs some actions that time the key value pairs will populate and
>>>>>>>> update, based on what action customer performs.
>>>>>>>>
>>>>>>>> On Mon, 23 Dec 2024, 10:51 pm Divyansh Gupta JNsThMAudy, <
>>>>>>>> ag1567827(at)gmail(dot)com> wrote:
>>>>>>>>
>>>>>>>>> Let's make it more understandable, here is the table schema with
>>>>>>>>> 50 columns in it
>>>>>>>>>
>>>>>>>>> CREATE TABLE dbo.googledocs_tbl (
>>>>>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1
>>>>>>>>> MINVALUE 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL,
>>>>>>>>> userid int8 NOT NULL,
>>>>>>>>> t1 int4 NULL,
>>>>>>>>> t2 int4 NULL,
>>>>>>>>> t3 int4 NULL,
>>>>>>>>> t4 int4 NULL,
>>>>>>>>> t5 int4 NULL,
>>>>>>>>> t6 int4 NULL,
>>>>>>>>> t7 int4 NULL,
>>>>>>>>> t8 int4 NULL,
>>>>>>>>> t9 int4 NULL,
>>>>>>>>> t10 int4 NULL,
>>>>>>>>> t11 int4 NULL,
>>>>>>>>> t12 int4 NULL,
>>>>>>>>> t13 int4 NULL,
>>>>>>>>> t14 int4 NULL,
>>>>>>>>> t15 int4 NULL,
>>>>>>>>> t16 int4 NULL,
>>>>>>>>> t17 int4 NULL,
>>>>>>>>> t18 int4 NULL,
>>>>>>>>> t19 int4 NULL,
>>>>>>>>> t20 int4 NULL,
>>>>>>>>> t21 int4 NULL,
>>>>>>>>> t22 int4 NULL,
>>>>>>>>> t23 int4 NULL,
>>>>>>>>> t24 int4 NULL,
>>>>>>>>> t25 int4 NULL,
>>>>>>>>> t26 int4 NULL,
>>>>>>>>> t27 int4 NULL,
>>>>>>>>> t28 int4 NULL,
>>>>>>>>> t29 int4 NULL,
>>>>>>>>> t30 int4 NULL,
>>>>>>>>> t31 int4 NULL,
>>>>>>>>> t32 int4 NULL,
>>>>>>>>> t33 int4 NULL,
>>>>>>>>> t34 int4 NULL,
>>>>>>>>> t35 int4 NULL,
>>>>>>>>> t36 int4 NULL,
>>>>>>>>> t37 int4 NULL,
>>>>>>>>> t38 int4 NULL,
>>>>>>>>> t39 int4 NULL,
>>>>>>>>> t40 int4 NULL,
>>>>>>>>> t41 int4 NULL,
>>>>>>>>> t42 int4 NULL,
>>>>>>>>> t43 int4 NULL,
>>>>>>>>> t44 int4 NULL,
>>>>>>>>> t45 int4 NULL,
>>>>>>>>> t46 int4 NULL,
>>>>>>>>> t47 int4 NULL,
>>>>>>>>> t48 int4 NULL,
>>>>>>>>> t49 int4 NULL,
>>>>>>>>> t50 int4 NULL,
>>>>>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid),
>>>>>>>>> );
>>>>>>>>>
>>>>>>>>> Every time when i query I will query it along with userid
>>>>>>>>> Ex : where userid = 12345678 and t1 in (1,2,3) and t2 in (0,1,2)
>>>>>>>>> more key filters if customer applies
>>>>>>>>>
>>>>>>>>> On the other hand if I create a single jsonb column the schema
>>>>>>>>> will look like :
>>>>>>>>>
>>>>>>>>> CREATE TABLE dbo.googledocs_tbl (
>>>>>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1
>>>>>>>>> MINVALUE 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL,
>>>>>>>>> userid int8 NOT NULL,
>>>>>>>>> addons_json jsonb default '{}'::jsonb
>>>>>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid),
>>>>>>>>> );
>>>>>>>>>
>>>>>>>>> and the query would be like
>>>>>>>>> where userid = 12345678 and ((addons_json @> {t1:1}) or
>>>>>>>>> (addons_json @> {t1:2}) or (addons_json @> {t1:3})
>>>>>>>>> more key filters if customer applies
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Dec 23, 2024 at 10:38 PM David G. Johnston <
>>>>>>>>> david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Dec 23, 2024, 10:01 Divyansh Gupta JNsThMAudy <
>>>>>>>>>> ag1567827(at)gmail(dot)com> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> So here my question is considering one JSONB column is perfect
>>>>>>>>>>> or considering 50 columns will be more optimised.
>>>>>>>>>>>
>>>>>>>>>> The relational database engine is designed around the
>>>>>>>>>> column-based approach. Especially if the columns are generally unchanging,
>>>>>>>>>> combined with using fixed-width data types.
>>>>>>>>>>
>>>>>>>>>> David J.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Death to <Redacted>, and butter sauce.
>>>>>>> Don't boil me, I'm still alive.
>>>>>>> <Redacted> lobster!
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Death to <Redacted>, and butter sauce.
>>>>> Don't boil me, I'm still alive.
>>>>> <Redacted> lobster!
>>>>>
>>>>
>>
>> --
>> Death to <Redacted>, and butter sauce.
>> Don't boil me, I'm still alive.
>> <Redacted> lobster!
>>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Arnold Morein | 2024-12-23 18:34:52 | Re: Need help in database design |
Previous Message | Divyansh Gupta JNsThMAudy | 2024-12-23 18:30:25 | Re: Need help in database design |