Re: Compare rows

From: Greg Spiegelberg <gspiegelberg(at)cranel(dot)com>
To: PgSQL Performance ML <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Compare rows
Date: 2003-10-08 18:05:34
Message-ID: 3F8451EE.2050006@cranel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

See below.

Shridhar Daithankar wrote:
> Greg Spiegelberg wrote:
>
>> The data represents metrics at a point in time on a system for
>> network, disk, memory, bus, controller, and so-on. Rx, Tx, errors,
>> speed, and whatever else can be gathered.
>>
>> We arrived at this one 642 column table after testing the whole
>> process from data gathering, methods of temporarily storing then
>> loading to the database. Initially, 37+ tables were in use but
>> the one big-un has saved us over 3.4 minutes.
>
>
> I am sure you changed the desing because those 3.4 minutes were
> significant to you.
>
>
> But I suggest you go back to 37 table design and see where bottleneck
> is. Probably you can tune a join across 37 tables much better than
> optimizing a difference between two 637 column rows.

The bottleneck is across the board.

On the data collection side I'd have to manage 37 different methods
and output formats whereas now I have 1 standard associative array
that gets reset in memory for each "row" stored.

On the data validation side, I have one routine to check the incoming
data for errors, missing columns, data types and so on. Quick & easy.

On the data import it's easier and more efficient to do one COPY for
a standard format from one program instead of multiple programs or
COPY's. We were using 37 PHP scripts to handle the import and the
time it took to load, execute, exit, reload each script was killing
us. Now, 1 PHP and 1 COPY.

> Besides such a large number of columns will cost heavily in terms of
> defragmentation across pages. The wasted space and IO therof could be
> significant issue for large number of rows.

No arguement here.

> 642 column is a bad design. Theoretically and from implementation of
> postgresql point of view. You did it because of speed problem. Now if we
> can resolve those speed problems, perhaps you could go back to other
> design.
>
> Is it feasible for you right now or you are too much committed to the
> big table?

Pretty commited though I do try to be open.

Greg

--
Greg Spiegelberg
Sr. Product Development Engineer
Cranel, Incorporated.
Phone: 614.318.4314
Fax: 614.431.8388
Email: gspiegelberg(at)Cranel(dot)com
Cranel. Technology. Integrity. Focus.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2003-10-08 18:05:47 Re: PostgreSQL vs. MySQL
Previous Message Bruce Momjian 2003-10-08 17:58:57 Re: PostgreSQL vs. MySQL