Re: deb package sizes

From: Cédric Villemain <Cedric(dot)Villemain(at)Data-Bene(dot)io>
To: Magnus Hagander <magnus(at)hagander(dot)net>, Álvaro Hernández <aht(at)ongres(dot)com>
Cc: Jeremy Schneider <schneider(at)ardentperf(dot)com>, Christoph Berg <myon(at)debian(dot)org>, pgsql-pkg-debian(at)lists(dot)postgresql(dot)org
Subject: Re: deb package sizes
Date: 2025-01-21 09:26:00
Message-ID: 964e72f1-89aa-4512-9d7a-ec722165d291@Data-Bene.io
Lists: pgsql-pkg-debian


On 10/01/2025 10:52, Magnus Hagander wrote:
> On Thu, Jan 9, 2025 at 11:40 PM Álvaro Hernández <aht(at)ongres(dot)com> wrote:
>
>
>
> On 9/1/25 18:08, Jeremy Schneider wrote:
>> On Thu, 9 Jan 2025 17:06:57 +0100
>> Álvaro Hernández <aht(at)ongres(dot)com> wrote:
>>
>>> On 9/1/25 10:07, Christoph Berg wrote:
>>>> Re: Jeremy Schneider
>>>>> I'm wondering if there might be any support for providing a
>>>>> "postgresql-slim" package on PGDG which excludes llvm and python? I
>>>>> think this might almost cut the total install size in half, and I
>>>>> think there might be many users who would value having the option.
>>>>>
>>>> Hi,
>>>>
>>>> could you explain why 250 MB is too much? Disk space these days is
>>>> ultra cheap
>>>     Hi Christoph.
>>>
>>>     Container images allow (are meant to) contain only the necessary
>>> files needed to run the process that will be run when the image is
>>> run. As such, any additional file poses two main problems:
>>>
>>> * Disk space is cheap. Bandwidth not so much. Time to start a
>>>
>>> * Security analysis. Unneeded files (specially binaries, but not
>> Another concern is the impact of image rebuilds as dependencies are
>> updated. Tianon (a primary maintainer of the docker images) has noted
>> that they limit frequency of the debian base containers, because every
>> rebuild of the base container triggers an avalanche of downstream
>> rebuilds. CNPG was doing daily rebuilds for a while, and every time any
>> python dependency was updated you'd get a new image - boto3 was
>> notorious for very frequent updates. So with a different image version
>> for every day, a single server running multiple copies of postgres might
>> easily end up with multiple image versions on the server as copies are
>> slowly updated.
>
>     I see this as a symptom of a different, bigger issue: that
> package versions, and all transitive dependencies, should be
> version pinned when building container images. I haven't seen too
> many examples of taking the effort to do this. But it's the only
> way to have a way to re-run building images and guarantee outputs
> that are reproducible. Once you have this in place, you can decide
> how and when you upgrade which versions.
>
>
> I'm guessing most container builders are just not interested in doing
> that much work. It's easier to just "always upgrade", but as noted
> that comes with a whole different set of problems. It's only really
> feasible if you manage to first reduce the set of dependencies
> substantially.
>
>
>     Actually, even version pinning is not enough, unless the
> package system guarantees that a version of a package is strictly
> immutable (and AFAIK this is usually not the case). So digest
> pinning is essentially required.
>
>
> Debian (as this was talking about it) is actually doing a very good
> job of that these days, though they're not there all the way. But
> https://tests.reproducible-builds.org/debian/reproducible.html shows
> they're doing really well.

There is also a "non-fancy" web page on debian.net:
https://amd64.reproduce.debian.net/#postgresql-17

There was a talk on this very topic, at minidebconf recently (by kpcyrd):

https://toulouse2024.mini.debconf.org/talks/4-reproducible-builds-rebuilding-what-is-distributed-from-ftpdebianorg/

"Since about a month we’ve also been rebuilding trying to exactly match
the builds being distributed via ftp.d.o - this talk will describe the
setup and the lessons learned so far, and why the results currently are
what they are (spoiler: less <30% reproducible) and what we can do to
fix that."

And rebuilderd is surely of interest for people willing to work on
reproducible builds: https://github.com/kpcyrd/rebuilderd
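
As a rough illustration of the digest pinning Álvaro mentions above, here
is a minimal Python sketch. It assumes the SHA-256 of a package file was
recorded when the version was pinned; the function names and the example
path are hypothetical and not part of any Debian or rebuilderd tooling.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path, pinned_hex):
    """True if the file's SHA-256 equals the pinned digest (case-insensitive)."""
    return sha256_of(path) == pinned_hex.strip().lower()


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python check_pin.py some-package.deb <sha256-hex>
    if len(sys.argv) == 3:
        ok = matches_pin(sys.argv[1], sys.argv[2])
        print("digest matches" if ok else "digest MISMATCH")
        sys.exit(0 if ok else 1)
```

Unlike a version string, the digest changes whenever the file content
changes, so a package republished under the same version number would be
caught by this check.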

---
Cédric Villemain +33 6 20 30 22 52
https://www.Data-Bene.io
PostgreSQL Support, Expertise, Training, R&D
