Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running

From: Ow Mun Heng <Ow(dot)Mun(dot)Heng(at)wdc(dot)com>
To: Nis Jørgensen <nis(at)superlativ(dot)dk>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running
Date: 2007-09-14 06:30:27
Message-ID: 1189751427.22514.78.camel@neuromancer.home.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, 2007-09-12 at 15:14 +0200, Nis Jørgensen wrote:
> El-Lotso skrev:
>
> > I'm on the verge of giving up... the schema seems simple and yet there's
> > so much issues with it. Perhaps it's the layout of the data, I don't
> > know. But based on the ordering/normalisation of the data and the one to
> > many relationship of some tables, this is giving the planner a headache
> > (and me a bulge on the head from knockin it against the wall)
>
> I think you should look more at the db design, and less on rewriting the
> query. Here are some observations:

I can't help much with the design per-se. So..

>
> - Your table structure is quite hard to understand (possibly because you
> have changed the names) - if you want help on improving it, you will
> need to explain the data to us, and possibly post some sample data.

If anyone is willing, I can send some sample data to you off-list.

on the trh table, hid is a subset of data for a particular id.

eg:
PARENT : CHILD 1
PARENT : CHILD 2
PARENT : CHILD 3
PARENT : CHILD 4

uniqueid = merged fields from id / index1 / index2 / start_timestamp(IN EPOCH)
/ phase_id / ttype which is unique on each table (but not across ALL the tables)

> - You seem to be lacking constraints on the tables. My guess is that
> (id,ttype,start_timestamp) is unique in both trh and ts - but I cannot
> tell (and neither can the query planner). Foreign key constraints might
> help as well. These would also help others to understand your data, and
> suggest reformulations of your queries.

AFAICT, there are no foreign constraints in the original DB design. (and
I'm not even sure how to begin the FK design based on this org design)

the unique_id is as above.
TRH/TRD uniqueid = merged fields from id / index1 / index2 /
start_timestamp(IN EPOCH) / phase_id / ttype

TS uniqueid = merged fields from id / start_timestamp(IN EPOCH) / ttype

Problem with this is that the fields in which they are unique is
different across the different tables, so the unique_id is only unique
for that table alone and acts as a primary key so that no dupes exists
in that one table.

> - Another guess is that the ttype sets (177,197,705,742,758,766),
> (69,178,198,704,757,741,765) are actually indicating some other property
> a common "type" of record, and that only one of each will be present for
> an id,start_timestamp combination. This may be related to the repeatingd
> fields issue - if a certain ttype indicates that we are interested in a
> certain pber_x field (and possibly that the others are empty).

yes..

eg:
id | hid |ttype | start_timestamp | pber_2 | pber 3 |pber_4
PARENT | 0 |764 | 2007-07-01 00:00 | 4000 | null | null
PARENT | 0 |765 | 2007-07-01 00:00 | null | 9000 | null
PARENT | 0 |766 | 2007-07-01 00:00 | null | null | 7999
PARENT | 1 |764 | 2007-07-01 00:00 | 4550 | null | null
PARENT | 1 |765 | 2007-07-01 00:00 | null | 9220 | null
PARENT | 1 |766 | 2007-07-01 00:00 | null | null | 6669

the subqueries are just to take out the fields with the value and leave
the nulls so that we end-up with

id |hid| start_timestamp |pber_2 | pber 3 | pber_4
PARENT | 0 | 2007-07-01 00:00 | 4000 | 9000 | 7999
PARENT | 1 | 2007-07-01 00:00 | 4550 | 9220 | 6669

which is basically just joining a table by itself, but there is a caveat
whereby pber_3 and pber_4 is/can only be joined together based on the
seq_date/seq_time in the ts table hence the query..

JOIN1.id = join2.id
and join1.seq_date = join2.seq_date
etc..

but the problem is confounded by the fact that there is numerous hid
values for head id

> - You have what looks like repeating fields - pber_x, fval_x, index_x -
> in your tables. Fixing this might not improve your query, but might be a
> good idea for other reasons.

it's being looked at by some other team to collapse this to something
like this

ttype | pber
764 | 500
765 | 600
766 | 700

so that there are lesser # of columns and no null fields. But the query
will remain the same

> - seq_date and seq_time seems like they may be redundant - are they
> different casts of the same data?

No. They're used to join together the pber_2/3/4 fields as one may
happen between a few hours to days between each other, but each will be
uniquely identified by the seq_date/time

eg :

id | pber_2 | seq_date | seq time
PARENT | 400 | 2007-07-01 00:00:00 | 1980-01-01 20:00:00
PARENT | 410 | 2007-07-10 00:00:00 | 1980-01-01 22:00:00

id | pber_3 | seq_date | seq time
PARENT | 900 | 2007-07-01 00:00:00 | 1980-01-01 20:00:00
PARENT | 100 | 2007-07-10 00:00:00 | 1980-01-01 22:00:00

id | pber_4 | seq_date | seq time
PARENT | 10000 | 2007-07-01 00:00:00 | 1980-01-01 20:00:00
PARENT | 999 | 2007-07-10 00:00:00 | 1980-01-01 22:00:00

so, the correct value for the fields when joined together will be of the
form

id |start_timestamp |seq_date | seq_time |pber_2 | pber 3 | pber_4
PARENT |2007-07-01 00:00 |2007-07-01 00:00:00 | 1980-01-01 20:00:00| 400 | 900 | 10000
PARENT |2007-07-01 00:00 |2007-07-10 00:00:00 | 1980-01-01 22:00:00| 410 | 100 | 999

(repeating for each hid subset value)

> All speculation. Hope it helps

anything would help.. I'm more or less willing to try anything to make
things faster else this project is going to the toilet.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Peter Childs 2007-09-14 07:02:23 Re: Long Running Commits - Not Checkpoints
Previous Message El-Lotso 2007-09-14 05:51:41 Re: Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running