SELECT query results are different depending on whether table statistics are available.

From: James Brauman <james(dot)brauman(at)envato(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: SELECT query results are different depending on whether table statistics are available.
Date: 2020-05-28 03:09:23
Message-ID: CAFCW2QOWsFZW=hCnzjidyttkDXda1qCgWk+=ms=xq0Z=qJJMug@mail.gmail.com
Lists: pgsql-general

I've run into a bit of a head-scratching situation and was hoping that
someone with more knowledge than I could help me understand the
behaviour I'm seeing.

I'm running on PostgreSQL 12.2.

I have a SELECT query that returns different results depending on
whether statistics for the table have been collected or not. The query
uses several CTEs and returns a single integer. This integer changes
depending on whether the table has been analyzed.

As far as I can tell I am not using any 'volatile' functions in my SELECT query.
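
A minimal sketch of how volatility can be checked in the system catalog, assuming the functions of interest are known by name. The names in the IN list are placeholders, not the functions used in the actual query.

```sql
-- provolatile is 'i' (immutable), 's' (stable) or 'v' (volatile).
-- 'now' and 'random' are placeholder names for illustration only.
SELECT p.oid::regprocedure AS function_signature,
       p.provolatile
FROM pg_proc AS p
WHERE p.proname IN ('now', 'random');
```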

It took me a while to find a way to reproduce the issue. How I
eventually reproduced it was:

-- Delete all statistics.
DELETE FROM pg_statistic;

-- Truncate table and insert values into table.
TRUNCATE TABLE target_table;
INSERT INTO target_table (...)
VALUES
(...);

-- The results of the SELECT are different depending on whether
-- ANALYZE is called.
ANALYZE target_table;

-- Run select query (involving several CTEs).
SELECT ...;

I haven't generated a minimal test case yet, but I did notice that if
all CTEs in the SELECT query are defined using AS NOT MATERIALIZED the
results are always the same regardless of whether the table has been
analyzed.
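
For reference, a sketch of the syntax in question. The CTE name and the id column are hypothetical, since the real query is not shown here. In PostgreSQL 12 a side-effect-free CTE that is referenced once is inlined into the outer query by default, while one referenced more than once is computed once and reused unless NOT MATERIALIZED is specified.

```sql
-- Hypothetical CTE over target_table; "id" is an assumed column name.
WITH candidates AS NOT MATERIALIZED (
    SELECT id
    FROM target_table
)
-- Referencing the CTE twice would normally make it materialized;
-- NOT MATERIALIZED lets the planner fold it into each reference.
SELECT count(*)
FROM candidates AS a
JOIN candidates AS b USING (id);
```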

Could anyone share knowledge about why this is happening?

Thanks,
James Brauman
