Re: Query is slower with a large proportion of NULLs in several columns

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Lars Bergeson <larsavatar(at)gmail(dot)com>
Cc: "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Query is slower with a large proportion of NULLs in several columns
Date: 2021-12-21 01:49:00
Message-ID: CAKFQuwZP8KUnLLukbfnxp7uqE5q-8vkOZcZGDhdXxZ0u1eOHpg@mail.gmail.com
Lists: pgsql-performance

On Monday, December 20, 2021, Lars Bergeson <larsavatar(at)gmail(dot)com> wrote:

>
> What is it about null values in the table that slows down the full table
> scan?
>
> If I populate blank/zero for all of the unused values in columns that are
> NULLable, the query is fast again. So just defining the columns as NULLable
> isn't what slows it down -- it's actually the NULL values in the rows that
> seem to degrade performance.
>

Whether a column is declared NOT NULL has no effect on the contents of the
page or tuple. What matters is the data: as soon as a row contains a single
null, a null bitmap [1] is added to the stored tuple. From then on, for
every column read from that tuple, the system has to check the bitmap to
see whether that column’s value is null. Given the number of columns in
your table, it is not surprising that the difference is noticeable.
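
To make the bitmap idea concrete, here is a toy sketch in Python. It is not
PostgreSQL's actual code; the function names null_bitmap and read_row are
made up for illustration, and rows are modelled as plain lists of integers.
The only parts taken from the storage documentation [1] are that the bitmap
exists only when the row has at least one null, that it holds one bit per
column, and that a 1 bit means not-null.

```python
from typing import List, Optional, Sequence


def null_bitmap(values: Sequence[Optional[int]]) -> Optional[bytes]:
    """Return one bit per column (1 = not null, 0 = null).

    Returns None when the row has no NULLs, modelling a tuple that is
    stored without any bitmap at all.
    """
    if all(v is not None for v in values):
        return None
    bits = bytearray((len(values) + 7) // 8)  # one bit per column, rounded up
    for i, v in enumerate(values):
        if v is not None:
            bits[i // 8] |= 1 << (i % 8)
    return bytes(bits)


def read_row(bitmap: Optional[bytes], stored: Sequence[int], ncols: int) -> List[Optional[int]]:
    """Rebuild the column values from the stored (non-null) data.

    Without a bitmap every column is read directly. With a bitmap,
    each column costs an extra bit test before its value can be read.
    """
    if bitmap is None:
        return list(stored)
    remaining = iter(stored)
    row: List[Optional[int]] = []
    for i in range(ncols):
        if bitmap[i // 8] & (1 << (i % 8)):
            row.append(next(remaining))  # value is physically present
        else:
            row.append(None)  # null: nothing was stored for this column
    return row


# Hypothetical 10-column row with several NULLs.
row = [1, None, 3, None, None, 6, 7, 8, 9, None]
bitmap = null_bitmap(row)
stored = [v for v in row if v is not None]
rebuilt = read_row(bitmap, stored, len(row))
```

The per-column bit test in read_row is the extra work described above. It
is small for one column, but it is repeated for every column of every row
that carries a bitmap.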

David J.

[1] https://www.postgresql.org/docs/current/storage-page-layout.html
