Re: Row estimates for empty tables

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>, postgre <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Row estimates for empty tables
Date: 2020-08-25 00:06:25
Message-ID: 20200825000625.GF31273@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Fri, Jul 24, 2020 at 09:14:04PM +0200, Pavel Stehule wrote:
> pá 24. 7. 2020 v 16:38 odesílatel Christophe Pettus <xof(at)thebuild(dot)com> napsal:
> > Since we already special-case parent tables for partition sets, would a
> > storage parameter that lets you either tell the planner "no, really, zero
> > is reasonable here" or sets a minimum number of rows to plan for be
> > reasonable? I happened to get bit by this tracking down an issue where
> > several tables in a large query had zero rows, and the planner's assumption
> > of a few pages worth caused some sub-optimal plans. The performance hit
> > wasn't huge, but they were being joined to some *very* large tables, and
> > the differences added up.
>
> I did this patch ten years ago. GoodData application
> https://www.gooddata.com/ uses Postgres lot, and this application stores
> some results in tables (as guard against repeated calculations). Lot of
> these tables have zero or one row.
>
> Although we ran an ANALYZE over all tables - the queries on empty tables
> had very bad plans, and I had to fix it by this patch. Another company uses
> a fake one row in table - so there is no possibility to have a really empty
> table.
>
> It is an issue for special, not typical applications (this situation is
> typical for some OLAP patterns) - it is not too often - but some clean
> solution (instead hacking postgres) can be nice.

On Mon, Aug 24, 2020 at 09:43:49PM +0200, Pavel Stehule wrote:
> This patch is just a workaround that works well 10 years (but for one
> special use case) - nothing more. Without this patch that application
> cannot work ever.

My own workaround was here:
https://www.postgresql.org/message-id/20200427181034.GA28974@telsasoft.com
|... 1) create an child table: CREATE TABLE x_child() INHERITS(x)
|and, 2) change the query to use "select from ONLY".
|
|(1) allows the planner to believe that the table really is empty, a conclusion
|it otherwise avoids and (2) avoids decending into the child (for which the
|planner would likewise avoid the conclusion that it's actually empty).

--
Justin

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Keisuke Kuroda 2020-08-25 06:50:17 Re: Creating many tables gets logical replication stuck
Previous Message David Rowley 2020-08-24 22:26:49 Re: Query plan prefers hash join when nested loop is much faster

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2020-08-25 00:26:12 Re: file_fdw vs relative paths
Previous Message Bruce Momjian 2020-08-25 00:01:26 Re: "cert" + clientcert=verify-ca in pg_hba.conf?