Re: queries with lots of UNIONed relations

From: Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: queries with lots of UNIONed relations
Date: 2011-01-14 03:19:19
Message-ID: 4D2FC0B7.30705@vmsinfo.com
Lists: pgsql-performance

On 1/13/2011 5:41 PM, Robert Haas wrote:
> You might be right, but I'm not sure. Suppose that there are 100
> inheritance children, and each has 10,000 distinct values, but none of
> them are common between the tables. In that situation, de-duplicating
> each individual table requires a hash table that can hold 10,000
> entries. But deduplicating everything at once requires a hash table
> that can hold 1,000,000 entries.
>
> Or am I all wet?
>
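
To put numbers on the scenario quoted above, here is a minimal Python
sketch. It is not PostgreSQL's executor; the function names and the use
of integer ranges as stand-ins for table contents are assumptions made
purely for illustration. It counts how many entries a hash table has to
hold when each child is de-duplicated on its own versus when everything
is de-duplicated at once.

```python
def make_children(num_children=100, values_per_child=10_000):
    """Build disjoint value ranges, one per hypothetical inheritance child."""
    return [
        range(i * values_per_child, (i + 1) * values_per_child)
        for i in range(num_children)
    ]


def peak_entries_per_child(children):
    """De-duplicate each child separately; only one hash table is live at a time."""
    peak = 0
    for child in children:
        seen = set(child)  # hash table scoped to this child only
        peak = max(peak, len(seen))
    return peak


def peak_entries_global(children):
    """De-duplicate all children through a single shared hash table."""
    seen = set()
    for child in children:
        seen.update(child)
    return len(seen)


children = make_children()
per_child_peak = peak_entries_per_child(children)
global_peak = peak_entries_global(children)
print(per_child_peak, global_peak)  # prints: 10000 1000000
```

With no values shared between children, the single hash table grows to
100 x 10,000 = 1,000,000 entries, while the per-child tables never
exceed 10,000 entries each.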

Have you considered using Google's MapReduce framework for things like
that? UNION and GROUP BY operations look like ideal candidates for it.
I am not sure whether MapReduce can be married to a relational
database, but I must say that I was impressed with the speed of MongoDB.
I am not suggesting that PostgreSQL should sacrifice its ACID compliance
for speed, but Mongo sure does look like a speeding bullet.
On the other hand, the algorithms that have been parallelized for a long
time are precisely the sort/merge and hash algorithms used for UNION and
GROUP BY. This is what I have in mind:
http://labs.google.com/papers/mapreduce.html
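
Roughly, the idea applied to a UNION could look like the Python sketch
below. This is only an illustration under stated assumptions: it runs in
a single process, the function names are made up for the example, and it
does not use any real MapReduce or PostgreSQL API. The "map" step routes
each value to a partition by hash, so equal values always land in the
same partition; the "reduce" step then de-duplicates each partition
independently, which is the part that could run in parallel.

```python
from collections import defaultdict


def map_phase(relations, num_partitions):
    """Route every value to a partition chosen by its hash."""
    partitions = defaultdict(list)
    for relation in relations:
        for value in relation:
            partitions[hash(value) % num_partitions].append(value)
    return partitions


def reduce_phase(partitions):
    """De-duplicate each partition on its own and concatenate the results."""
    result = []
    for bucket in partitions.values():
        # Equal values share a partition, so no duplicates cross partitions.
        result.extend(set(bucket))
    return result


def union_distinct(relations, num_partitions=4):
    """Hypothetical helper: UNION (distinct) over several lists of values."""
    return reduce_phase(map_phase(relations, num_partitions))


relations = [[1, 2, 3], [3, 4, 5], [5, 6, 1]]
distinct = sorted(union_distinct(relations))
print(distinct)  # prints: [1, 2, 3, 4, 5, 6]
```

Each partition needs a hash table only as large as its own share of the
distinct values, so the memory for de-duplication is spread across the
workers rather than held in one place.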

--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com
