Re: JOIN column maximum

From: Lee Hachadoorian <lee(dot)hachadoorian(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: JOIN column maximum
Date: 2012-01-06 01:10:17
Message-ID: 4F0649F9.7070209@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 01/05/2012 06:18 PM, Tom Lane wrote:
>
>> ERROR: joins can have at most 32767 columns
> It's the sum of the number of columns in the base tables.
>
That makes sense. I totally misunderstood the message to be referring to
the number of joined columns rather than table columns.
>
>> I've asked this list before for advice on how to work with the
>> approximately 23,000 column American Community Survey dataset,
> Are there really 23000 populated values in each row? I hesitate to
> suggest an EAV approach, but it kinda seems like you need to go in that
> direction. You're never going to get decent performance out of a schema
> that requires 100-way joins, even if you avoid bumping up against hard
> limits.
Many of the smaller geographies, e.g. census tracts, do in fact have
data for the vast majority of the columns. I am trying to combine it all
into one table to avoid the slowness of multiple JOINs (even though in
practice I'm never joining all the tables at once). EAV sounds correct
in terms of normalization, but isn't it usually better performance-wise
to store write-once/read-many data in a denormalized (i.e. flattened)
fashion? One of these days I'll have to try to benchmark some different
approaches, but for now planning on using array columns, with each
"sequence" (in the Census sense, not the Postgres sense) of 200+
variables in its own array rather than its own table.

--Lee

--
Lee Hachadoorian
PhD, Earth& Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Scott Marlowe 2012-01-06 02:22:57 Re: JOIN column maximum
Previous Message Kevin Duffy 2012-01-05 23:32:58 function return update count