Re: deb package sizes

From: Álvaro Hernández <aht(at)ongres(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Jeremy Schneider <schneider(at)ardentperf(dot)com>, Christoph Berg <myon(at)debian(dot)org>, pgsql-pkg-debian(at)lists(dot)postgresql(dot)org
Subject: Re: deb package sizes
Date: 2025-01-10 11:32:47
Message-ID: 46ef3419-5a67-4d41-95c2-1cb5db8f4d5a@ongres.com
Lists: pgsql-pkg-debian

On 10/1/25 10:52, Magnus Hagander wrote:
> On Thu, Jan 9, 2025 at 11:40 PM Álvaro Hernández <aht(at)ongres(dot)com> wrote:
>
>
>
> On 9/1/25 18:08, Jeremy Schneider wrote:
>> On Thu, 9 Jan 2025 17:06:57 +0100
>> Álvaro Hernández <aht(at)ongres(dot)com> wrote:
>>
>>> On 9/1/25 10:07, Christoph Berg wrote:
>>>> Re: Jeremy Schneider
>>>>> I'm wondering if there might be any support for providing a
>>>>> "postgresql-slim" package on PGDG which excludes llvm and python? I
>>>>> think this might almost cut the total install size in half, and I
>>>>> think there might be many users who would value having the option.
>>>>>
>>>> Hi,
>>>>
>>>> could you explain why 250 MB is too much? Disk space these days is
>>>> ultra cheap
>>>     Hi Christoph.
>>>
>>>     Container images allow (are meant to) contain only the necessary
>>> files needed to run the process that will be run when the image is
>>> run. As such, any additional file poses two main problems:
>>>
>>> * Disk space is cheap. Bandwidth not so much. Time to start a
>>>
>>> * Security analysis. Unneeded files (especially binaries, but not
>> Another concern is the impact of image rebuilds as dependencies are
>> updated. Tianon (a primary maintainer of the docker images) has noted
>> that they limit frequency of the debian base containers, because every
>> rebuild of the base container triggers an avalanche of downstream
>> rebuilds. CNPG was doing daily rebuilds for a while, and every time any
>> python dependency was updated you'd get a new image - boto3 was
>> notorious for very frequent updates. So with a different image version
>> for every day, a single server running multiple copies of postgres might
>> easily end up with multiple image versions on the server as copies are
>> slowly updated.
>
>     I see this as a symptom of a different, bigger issue: that
> package versions, and all transitive dependencies, should be
> version pinned when building container images. I haven't seen too
> many examples of taking the effort to do this. But it's the only
> way to re-run image builds and guarantee reproducible outputs.
> Once you have this in place, you can decide
> how and when you upgrade which versions.
>
>
> I'm guessing most container builders are just not interested in doing
> that much work. It's easier to just "always upgrade", but as noted
> that comes with a whole different set of problems. It's only really
> feasible if you manage to first reduce the set of dependencies
> substantially.

    Yes, it comes with a whole set of problems. The main one, other
than upgrades, is that you may end up with inconsistent environments:
cases where not all images deployed are the same because some
dependencies have different versions. This may also lead to different
CVEs present on different servers. This is far from ideal, and it is a
problem that is becoming more and more visible.

    While container builders may not be interested in doing all this
work, I think that it should be done regardless. And over time, it will
be done more and more. When security and supply-chain attacks are a
serious concern, precise knowledge of your dependencies is key.
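
    To make the "inconsistent environments" point concrete, here is a
minimal sketch of detecting version drift between two images. It assumes
each image's package list has been exported as whitespace-separated
"package version" lines (the kind of listing `dpkg-query -W` prints); the
function names and the sample data in the usage note are hypothetical.

```python
from typing import Dict, Optional, Tuple


def parse_listing(text: str) -> Dict[str, str]:
    """Parse 'package<whitespace>version' lines into a dict.

    Lines that do not have exactly two fields are skipped.
    """
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            packages[fields[0]] = fields[1]
    return packages


def version_drift(
    a: Dict[str, str], b: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return packages whose version differs between two images.

    A value of None means the package is absent from that image.
    """
    drift = {}
    for name in sorted(a.keys() | b.keys()):
        version_a, version_b = a.get(name), b.get(name)
        if version_a != version_b:
            drift[name] = (version_a, version_b)
    return drift
```

    For example, comparing listings from two hosts (made-up versions):
`version_drift(parse_listing(host_a), parse_listing(host_b))` returns an
empty dict only when both images carry exactly the same package set.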

>
>
>     Actually, even version pinning is not enough, unless the
> package system guarantees that a version of a package is strictly
> immutable (and AFAIK this is usually not the case). So digest
> pinning is essentially required.
>
>
> Debian (as this was talking about it) is actually doing a very good
> job at that these days, though they're not there all the way. But
> https://tests.reproducible-builds.org/debian/reproducible.html shows
> they're doing really well.

    Debian is doing a great job towards making the builds of their
packages reproducible. However, AFAIK a given package version can be
republished with different content, which is why a service like
https://snapshot.debian.org exists.
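
    A minimal sketch of what digest pinning means in practice, assuming
the package files have already been downloaded into a local directory
and that the pins are kept as a filename-to-SHA-256 mapping. The function
names and the pin format are hypothetical, not an existing tool's
interface.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned(pins: Dict[str, str], directory: Path) -> List[str]:
    """Return the pinned filenames that are missing or whose content
    does not match the pinned digest. An empty list means all pins hold.
    """
    mismatches = []
    for name, expected in pins.items():
        candidate = directory / name
        if not candidate.is_file() or sha256_of(candidate) != expected.lower():
            mismatches.append(name)
    return mismatches
```

    The point of checking content rather than version strings is that a
file with the same name and version but different bytes is reported as a
mismatch, so the build can fail instead of silently picking it up.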

    Álvaro

--

Alvaro Hernandez

-----------
OnGres
