Coccinelle for PostgreSQL development [1/N]: coccicheck.py

From: Mats Kindahl <mats(at)timescale(dot)com>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Coccinelle for PostgreSQL development [1/N]: coccicheck.py
Date: 2025-01-07 19:44:55
Message-ID: CA+14426e8dbmMjGLu8jO8CQAb9-FKiM-CQhvQHUB=3OnJwWpzQ@mail.gmail.com
Lists: pgsql-hackers

I had some spare time during the holidays, so I spent some of it
on something I've been thinking about for a while.

For those of you who are not aware of it: Coccinelle is a tool for pattern
matching and text transformation for C code. It can be used to detect
problematic programming patterns and to make complex, tree-wide patches
easy. It is aware of the structure of C code and is better suited to making
complicated changes than normal text substitution tools like sed and Perl.

Coccinelle has been used successfully in the Linux project since 2008
and is now an established tool for Linux development. A large number of
semantic patches have been added to the source tree to capture everything
from generic issues (like eliminating the redundant A in expressions like
"!A || (A && B)") to more Linux-specific problems like adding a missing
call to kfree().
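
To see why the A is redundant there: "!A || (A && B)" is equivalent to
"!A || B", which can be checked with a small truth table. The following is
only an illustrative Python sketch of that equivalence, not a semantic
patch; the function names are made up for this example.

```python
from itertools import product


def original(a: bool, b: bool) -> bool:
    # The expression as written: !A || (A && B)
    return (not a) or (a and b)


def simplified(a: bool, b: bool) -> bool:
    # The same expression with the redundant A removed: !A || B
    return (not a) or b


# Compare both forms for every combination of truth values.
is_equivalent = all(
    original(a, b) == simplified(a, b)
    for a, b in product([False, True], repeat=2)
)
```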

Although PostgreSQL is nowhere near the size of the Linux kernel, it is
nevertheless of a significant size and would benefit from incorporating
Coccinelle into its development process. I noticed it was used in a few
cases about 10 years ago to fix issues in the PostgreSQL code, but I
thought it might be useful to make it part of normal development practice
to, among other things:

- Identify and correct bugs in the source code both during development and
  review.
- Make large-scale changes to the source tree to improve the code based on
  new insights.
- Encode and enforce APIs by ensuring that function calls are used
  correctly.
- Use improved coding patterns for more efficient code.
- Allow extensions to automatically update code for later PostgreSQL
  versions.

To that end, I created a series of patches to show how it could be used in
the PostgreSQL tree. Concrete code is a lot easier to discuss, and I split
the series into separate messages so that each patch can be discussed
individually. The series contains code to make it easy to work with
Coccinelle during development and reviews, as well as examples of semantic
patches that capture problems, make large-scale changes, enforce APIs, and
improve some coding patterns.

This first patch contains the coccicheck.py script, which is a
re-implementation of the coccicheck script that the Linux kernel uses. We
cannot use the Linux coccicheck script directly, since it is quite closely
tied to the Linux source code tree, and we need something that supports
both autoconf and Meson. Since Python seems to be used more and more in
the tree, it seems to be the most natural choice. (I have no strong opinion
on what language to use, but think it would be good to have something that
is as platform-independent as possible.)

The intention is that we should be able to use the Linux semantic patches
directly, so the script supports the "Requires" and "Options" keywords,
which can be used to require a specific version of spatch(1) and to add
options to the execution of that semantic patch, respectively.
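
As a rough sketch of how such keywords could be read, the following assumes
they appear in comment lines of the form "// Requires: <version>" and
"// Options: <flags>", as in the Linux semantic patches. This is not the
code from the attached patch; parse_cocci_headers, version_tuple, and
meets_requirement are hypothetical names used only for illustration.

```python
import re
import shlex

# Matches comment lines such as "// Options: --no-includes" or
# "/// Requires: 1.0.4" (assumed header format, see the note above).
HEADER_RE = re.compile(r"^\s*//+\s*(Requires|Options)\s*:\s*(.*?)\s*$")


def parse_cocci_headers(text):
    """Return (required_version, options) found in a semantic patch."""
    required = None
    options = []
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if not match:
            continue
        keyword, value = match.groups()
        if keyword == "Requires":
            required = value
        else:
            # Split the option string the way a shell would.
            options.extend(shlex.split(value))
    return required, options


def version_tuple(version):
    """Turn "1.0.8" into (1, 0, 8) so that versions compare numerically."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def meets_requirement(installed, required):
    """True if no version is required or the installed one is new enough."""
    if required is None:
        return True
    return version_tuple(installed) >= version_tuple(required)
```

The collected options would then be appended to the spatch command line,
and the semantic patch skipped if meets_requirement() is false for the
installed spatch version.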
--
Best wishes,
Mats Kindahl, Timescale

Attachment Content-Type Size
0001-Add-initial-coccicheck-script.v1.patch text/x-patch 7.1 KB
