hashed crosstab

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	"Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject:	hashed crosstab
Date:	2003-03-03 04:27:44
Message-ID:	3E62D9C0.6050207@joeconway.com
Lists:	pgsql-patches

Attached is an update to contrib/tablefunc. It implements a new hashed
version of crosstab. This fixes a major deficiency in real-world use of
the original version. Easiest to understand with an illustration:

Notice that the original crosstab slides data over to the left in the
result tuple when it encounters missing data. In order to work around
this you have to make your source SQL do all sorts of contortions
(cartesian join of distinct rowid with distinct attribute; left join
that back to the real source data). The new version avoids this by
building a hash table using a second distinct attribute query.
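
The difference between the two behaviours can be sketched outside the
database. The following is a minimal Python illustration, not the C code
in the patch: the sample rows, the function names positional_crosstab
and hashed_crosstab, and the choice to ignore attributes missing from
the category list are all assumptions made for the example.

```python
from itertools import groupby

# Hypothetical source rows: (rowid, attribute, value), sorted by rowid.
# Note that row2 has no value for att2.
ROWS = [
    ("row1", "att1", "a"),
    ("row1", "att2", "b"),
    ("row1", "att3", "c"),
    ("row2", "att1", "d"),
    ("row2", "att3", "f"),
]


def positional_crosstab(rows, ncols):
    """Fill value columns left to right in arrival order.

    This mimics the sliding behaviour described for the original
    crosstab: a missing attribute shifts later values to the left.
    """
    out = []
    for rowid, group in groupby(rows, key=lambda r: r[0]):
        values = [value for _, _, value in group][:ncols]
        values += [None] * (ncols - len(values))  # pad on the right
        out.append((rowid, *values))
    return out


def hashed_crosstab(rows, categories):
    """Place each value in the column its attribute maps to.

    `categories` stands in for the result of the second,
    distinct-attribute query; the dict plays the role of the hash table.
    """
    col_of = {cat: i for i, cat in enumerate(categories)}
    out = []
    for rowid, group in groupby(rows, key=lambda r: r[0]):
        values = [None] * len(categories)
        for _, attr, value in group:
            idx = col_of.get(attr)
            if idx is not None:  # assumption: unknown attributes are skipped
                values[idx] = value
        out.append((rowid, *values))
    return out


if __name__ == "__main__":
    print(positional_crosstab(ROWS, 3))
    print(hashed_crosstab(ROWS, ["att1", "att2", "att3"]))
```

In the positional version, row2's att3 value lands in the second column;
in the hashed version it stays in the third column and the att2 slot is
left empty.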

The new version also allows for "extra" columns (see the README) and
allows the result columns to be coerced into differing datatypes if they
are suitable (as shown above).

In testing a "real-world" data set (69 distinct rowids, 27 distinct
categories/attributes, multiple missing data points) I saw about a
5-fold improvement in execution time (from about 2200 ms old, to 440 ms
new).

I left the original version intact because: 1) backward compatibility, 2) it is probably
slightly faster if you know that you have no missing attributes.

README and regression test adjustments included. If there are no
objections, please apply.

Thanks,

Joe

Attachment	Content-Type	Size
tablefunc-ct_hash.1.patch	text/plain	29.0 KB

Responses

Re: hashed crosstab at 2003-03-18 00:26:15 from Bruce Momjian
Re: hashed crosstab at 2003-03-20 06:46:27 from Bruce Momjian
