Re: Design of a database table

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: hmidi slim <hmidi(dot)slim2(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Design of a database table
Date: 2018-07-30 23:25:08
Message-ID: c1115541-98bb-380b-520f-2124e9381333@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 07/30/2018 07:37 AM, hmidi slim wrote:
> I'm trying to design a database table. First of all there are two
> alternatives:
> 1-) Divide the table into two tables and make a join.
> 2-) Design a single table.
>
> 1rst alternative:
> Create table data_periods(
> id serial primary key not null,
> period daterange,
> project_id integer
> )
>
> create table data_periods_info(
> id serial primary key not null,
> data_periods_id integer,
> data_sub_periods daterange,
> stock1 integer,
> stock2 integer,
> CONSTRAINT data_periods_allotment_id_fkey FOREIGN KEY (data_periods_id)
>         REFERENCES data_periods (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
> )
>
> The table data_periods contains 1M rows and data_periods_info 5M rows.
> I added an index to the table data_periods_info for the column
> data_periods_id
> I execute this query:
> select
> data_periods.id <http://data_periods.id>,
> data_sub_periods,
> project_id,
> stock1,
> stock2
> from data_periods
> inner join data_periods_info on data_periods_info.data_periods_id =
> data_periods.id <http://data_periods.id>
> where data_periods.period && '[2018-07-28, 2018-08-02]'::daterange
> and data_sub_periods  && '[2018-07-28, 2018-08-02]'::daterange
>
> I got an execution time of : 1s 300ms
>
>
> 2nd alternative:
> create table data_periods_second(
> id serial primary key not null,
> data_sub_periods daterange,
> project_id integer,
> stock1 integer,
> stock2 integer)
>
> I run this query;
> select * from data_periods_second
> where data_sub_periods && '[2018-07-28, 2018-08-02]'::daterange
>
> I got such a execution time : 1s
>
>
> Is it normal to get an execution time when using join relation greatest
> than the execution time of a table contains million of rows and many
> columns?

Not surprising given that you are searching for a date range in two
tables in the join versus only one in the other case. That fact that you
are using the same range end points for period in data_periods and
data_sub_periods in data_periods_info, to me, points to a design flaw.
If period and data_sub_periods are the same why separate and repeat
them? Also when asking for input on query planning/outcomes running
EXPLAIN ANALYZE on the queries and posting the results here will help
arrive at answer.

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ken Tanzer 2018-07-30 23:29:58 Re: Cosmetically-varying casts added to view definitions
Previous Message Andres Freund 2018-07-30 23:24:48 Re: Restore relhaspkey in PostgreSQL Version 11 Beta