Quick Links

tables with lots of columns - what alternative from performance point of view?

From:	hubert depesz lubaczewski <depesz(at)gmail(dot)com>
To:	PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject:	tables with lots of columns - what alternative from performance point of view?
Date:	2005-12-07 07:42:38
Message-ID:	9e4684ce0512062342l75878299t33190a20c3f29c47@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

hi
jus recently there were some thread on postgresql list with people asying :
i have 700 columns, i have 1000 columns and so on.
some people, imediatelly responded: change your schema.
this is what forced to me ask:

i have a situation where i ahve to store a number of "objects" in database.
all objects have 3 specific attributes (which go into objects table), and
may have a lot of "custom fields".
basically - lsit of accessible custom fields for object depends on which
object-category this object belongs to.
now.
i know, i could have written it in this way:

create table object_custom_fields (id serial primary key, object_id int8,
field_id int8, field_value text);
but:
this approach has two very big drawbacks (for me):
1. the table cannot differentiate between custom fields of type "date",
"number" and so on. - everything is stored as text.
2. it is rather slow. i have to do a non-unique index scan over
object_custom_fields, get all records, and pivot it (on the client side of
curse) to make it usable.

i did it differently, definitelly not nicely, but i dont see any other way
to get this performance with unknown list of custom fields:
1. create table cf_types (id serial, codename text, representation text);
2. create table cf_definitions (id serial, category_id int8, type_id int8,
field-number int4);
3. create table cf_values (id serial, object_id int8 (unique),
...................................................);

now.
in cf_definitions i specify, category, field_type_id, and a field-number -
which relates to _NUMBER in fields in cf_values.

what i did achive is *very* fast retrieval of data for any given object.
the schema of cf_values table is absolutelly awful, and i will never say
differently.
my point is - if somebody (tom lane for example) says - redesign your schema
- whenever he reads about table with 700 column (i have more :) - then i
must have missed something absolutelyl simple, fast and elegant. what is
this?

depesz

Responses

Re: tables with lots of columns - what alternative from at 2005-12-07 08:47:22 from Oleg Bartunov

Browse pgsql-general by date

	From	Date	Subject
Next Message	A. Kretschmer	2005-12-07 07:47:02	Re: Delete Question
Previous Message	Emil Rachovsky	2005-12-07 07:39:23	Re: [SQL] lost in system tables