Re: Whether to back-patch fix for aggregate transtype width estimates

From: Greg Stark <stark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Whether to back-patch fix for aggregate transtype width estimates
Date: 2016-06-19 02:12:40
Message-ID: CAM-w4HPchTTZS6+Z-WOi_THNhdR=H4YNYuZgRrAi3BEDTorPCw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jun 18, 2016 at 5:14 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> . In 9.x, that's broken and it falls back to
> get_typavgwidth's default guess of 32 bytes. If what you've actually
> got is, say, varchar(255) and most of the entries actually approach
> that length, this could result in a drastic underestimate, possibly
> leading to OOM from hash table growth.
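For readers who want to see the shape of that fallback, here is a minimal sketch under a simplified model: when the column's typmod is not available (the 9.x breakage Tom describes), the width guess degenerates to a flat 32 bytes. The names and the "trust the declared limit" branch are made up for illustration; the real logic in get_typavgwidth() is more nuanced.

/*
 * Illustrative sketch only: a simplified width-guess fallback.
 * Not the actual get_typavgwidth() code.
 */
#include <stdio.h>

#define DEFAULT_WIDTH_GUESS 32	/* fallback when nothing better is known */

/* typmod < 0 means "unknown", as for a varchar whose length limit was lost */
static int
guess_avg_width(int typmod)
{
	if (typmod > 0)
		return typmod;			/* trust the declared varchar(n) limit */
	return DEFAULT_WIDTH_GUESS;
}

int
main(void)
{
	printf("varchar(255), typmod known: %d bytes\n", guess_avg_width(255));
	printf("varchar(255), typmod lost:  %d bytes (underestimate)\n", guess_avg_width(-1));
	printf("varchar(1),   typmod lost:  %d bytes (overestimate)\n", guess_avg_width(-1));
	return 0;
}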

This seems more likely to result in the converse. At 32 bytes per entry,
with enough distinct values I imagine it's going to avoid a hash join
anyway. (And in any case, if you have a varchar(n) where n > 32, it's
probably a bad bet to assume n tells you much about the typical length
of the strings.) On the other hand, if what you've actually got is a
varchar(1) or something like that, then a hash join might indeed have
been a good choice.
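To put rough numbers on both scenarios, here is a back-of-the-envelope sketch. The group counts, per-entry overhead, and 64MB memory budget are assumptions picked for illustration, not planner constants.

/*
 * Rough arithmetic for the two failure modes discussed above:
 * an underestimated entry width that lets a hash plan blow its memory
 * budget, and an overestimated width that rejects a hash plan which
 * would have fit.  All figures here are made up for the example.
 */
#include <stdio.h>

#define OVERHEAD 24				/* assumed per-entry bookkeeping, bytes */
#define BUDGET   (64L << 20)	/* assumed in-memory hash budget: 64MB */

static void
scenario(const char *label, long ngroups, long est_width, long real_width)
{
	long	est_bytes = ngroups * (est_width + OVERHEAD);
	long	real_bytes = ngroups * (real_width + OVERHEAD);

	printf("%s: planner sees ~%ld MB (%s), reality is ~%ld MB (%s)\n",
		   label,
		   est_bytes >> 20, est_bytes <= BUDGET ? "hash chosen" : "hash avoided",
		   real_bytes >> 20, real_bytes <= BUDGET ? "fits fine" : "blows the budget");
}

int
main(void)
{
	/* Tom's case: varchar(255) guessed at 32 bytes, hash chosen, then OOM risk */
	scenario("underestimate", 1000000, 32, 255);
	/* the converse: varchar(1) guessed at 32 bytes, hash skipped needlessly */
	scenario("overestimate ", 2000000, 32, 1);
	return 0;
}

With those numbers, the 32-byte guess makes the varchar(255) case look safe when it is actually several times over budget, while the varchar(1) case has a hash plan rejected that would have fit comfortably.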

--
greg
