From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: queries with lots of UNIONed relations |
Date: | 2011-01-13 23:05:04 |
Message-ID: | 18301.1294959904@sss.pgh.pa.us |
Lists: | pgsql-performance |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Jan 13, 2011 at 5:26 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I don't believe there is any case where hashing each individual relation
>> is a win compared to hashing them all together. If the optimizer were
>> smart enough to be considering the situation as a whole, it would always
>> do the latter.
> You might be right, but I'm not sure. Suppose that there are 100
> inheritance children, and each has 10,000 distinct values, but none of
> them are common between the tables. In that situation, de-duplicating
> each individual table requires a hash table that can hold 10,000
> entries. But deduplicating everything at once requires a hash table
> that can hold 1,000,000 entries.
> Or am I all wet?
If you have enough memory to de-dup them individually, you surely have
enough to de-dup all at once. It is not possible for a single hashtable
to have worse memory consumption than N hashtables followed by a union
hashtable, and in fact if there are no common values then the latter eats
twice as much space, because every value appears twice: once in a
per-relation hashtable and once in the union hashtable.
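
The entry counts for the 100-children example quoted above can be checked with a rough sketch. This is only an illustration using Python sets as stand-ins for hash tables, not PostgreSQL's executor; the function names are hypothetical, and it counts entries stored across all tables, which equals peak usage only under the assumption that the per-relation tables stay allocated until the union is finished.

```python
def single_table_entries(relations):
    """De-dup all relations through one hash table; return entries stored."""
    seen = set()
    for rel in relations:
        seen.update(rel)
    return len(seen)


def per_relation_then_union_entries(relations):
    """De-dup each relation on its own, then de-dup the union of the results.

    Returns the total number of entries stored across the N per-relation
    tables plus the final union table (assumes none is freed early).
    """
    total = 0
    union = set()
    for rel in relations:
        local = set(rel)       # per-relation hash table
        total += len(local)
        union.update(local)    # every surviving value is stored again here
    return total + len(union)


# Assumed data matching the quoted example: 100 children,
# 10,000 distinct values each, no value shared between children.
relations = [range(i * 10_000, (i + 1) * 10_000) for i in range(100)]

single = single_table_entries(relations)
staged = per_relation_then_union_entries(relations)
print(single, staged)  # 1000000 2000000
```

With no common values the staged approach stores each value twice. With heavy overlap the union table shrinks, but the staged total still cannot drop below the single-table count.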
regards, tom lane