Please improve "Limitations" section for GIN indexes

From: Marcin Barczynski <mbarczynski(at)starfishstorage(dot)com>
To: pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: Please improve "Limitations" section for GIN indexes
Date: 2023-03-13 08:41:06
Message-ID: CAOhG4we40pPe-53n9LW8cYY31Ng7XSiLde_nZjbxRj0aQ_EX9w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs

We've learned the hard way that keys are never removed in GIN indexes,
leading to infinite index size growth in certain scenarios. The PostgreSQL
documentation doesn't mention it even in the "Limitations" section for GIN
indexes (https://www.postgresql.org/docs/15/gin-limit.html)

The only place I found info about the unusual behavior is README file in
the source code:
https://github.com/postgres/postgres/blob/master/src/backend/access/gin/README#L391
.

We keep a tree-like structure in the database and use GIN indexes to find
descendants of a node quickly. The simplified table looks as follows:

node_id | ancestor_ids
--------|-------------
1 | []
2 | [1]
15 | [1, 2]

node_id is a sequence, and nodes are often added and removed from the
table: today node_ids are in the range from 1 to 100K but next month it
will be 500K-600K. Because GIN index never removes entry keys, it will
contain keys from 1 to 600K. In our case the GIN index became 2.5 times
larger than the table - that's the hard-way part.

How about sparing others the hard-way part and explicitly mentioning the
unusual behavior in the "Limitations" section for GIN indexes?

Best regards,
Marcin Barczyński

Browse pgsql-docs by date

  From Date Subject
Next Message Tomas Pospisek 2023-03-13 09:00:36 Re: Make SSPI documentation clearer
Previous Message Stephen Frost 2023-03-13 00:36:53 Re: Make SSPI documentation clearer