From: Jacob Bachmeyer
Subject: Re: GNU Coding Standards, automake, and the recent xz-utils backdoor
Date: Sun, 31 Mar 2024 02:17:25 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.22) Gecko/20090807 MultiZilla/1.8.3.4e SeaMonkey/1.1.17 Mnenhy/0.7.6.0
Eric Gallager wrote:
> Specifically, what caught my attention was how the release tarball containing the backdoor didn't match the history of the project in its git repository. That made me think about automake's `distcheck` target, whose entire purpose is to make it easier to verify that a distribution tarball can be rebuilt from itself and contains all the things it ought to contain.
The problem is that a release tarball is a freestanding object, with no dependency on the repository from which it was produced. In this case, the attacker added a bogus "update" of build-to-host.m4 from gnulib to the release tarball, but that file is not stored in the Git repository. This would not have tripped "make distcheck" because the crocked tarball can indeed be used to rebuild another crocked tarball.
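To make the mismatch concrete, here is a minimal Python sketch of the comparison that "make distcheck" does not perform: list the files in a release tarball and report the ones the Git repository does not track. The function names are hypothetical, and the sketch assumes the tarball unpacks into a single top-level directory and that `git` is on PATH. Autotools releases legitimately ship generated files (configure, Makefile.in, m4 files copied in from gnulib) that are not in Git, so the output is a list to review, not a verdict.

```python
import subprocess
import tarfile


def strip_top_dir(name):
    """Drop the leading 'package-version/' component of a tarball member name."""
    parts = name.split("/", 1)
    return parts[1] if len(parts) == 2 else ""


def tarball_files(tar_path):
    """Regular files in the tarball, relative to its top-level directory."""
    with tarfile.open(tar_path) as tar:
        names = {strip_top_dir(m.name) for m in tar.getmembers() if m.isfile()}
    names.discard("")
    return names


def git_tracked_files(repo_dir):
    """Paths tracked by Git in repo_dir (assumes `git` is installed)."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "ls-files", "-z"],
        check=True, capture_output=True, text=True,
    ).stdout
    return {path for path in out.split("\0") if path}


def untracked_in_release(release_files, tracked_files):
    """Files shipped in the release that the repository does not track."""
    return sorted(set(release_files) - set(tracked_files))


# Example with hypothetical paths:
#   extra = untracked_in_release(tarball_files("pkg-1.0.tar.gz"),
#                                git_tracked_files("pkg.git"))
#   print("\n".join(extra))
```

Every file in that list still has to be accounted for by hand or by regenerating it from known-good tools; a replaced m4 file would show up here only as one more "generated" file among many.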
As Alexandre Oliva mentioned in his reply, there is not really any good way to prevent this, since the attacker could also patch the generated configure script more directly. (I seem to remember past incidents where tampered release tarballs had configure scripts that would download and run shell scripts. If you ran configure as root, well...)
The *user* could catch issues like this backdoor: it appears (based on what I have read so far) to materialize certain object files while configure is running, whereas `find . -iname '*.o'` /should/ return nothing before make is run. This also suggests that running "make clean" after configure would kill at least this backdoor. A *very* observant (unreasonably so) user might notice that "make" did not build the objects that the backdoor provided.
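The object-file check can be scripted. The following is a rough Python equivalent of `find . -iname '*.o'`, meant to be run on the build tree after configure and before make; the function name and the default suffix list are assumptions for illustration, and an empty result proves nothing about backdoors that use a different mechanism.

```python
import os


def find_prebuilt_objects(tree, suffixes=(".o",)):
    """Return object files already present under `tree`, relative to it.

    Matching is case-insensitive, like `find -iname`. Pass extra suffixes
    (for example ".a" or ".so") to widen the search.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(tree):
        for name in filenames:
            if name.lower().endswith(tuple(suffixes)):
                full = os.path.join(dirpath, name)
                hits.append(os.path.relpath(full, tree))
    return sorted(hits)


if __name__ == "__main__":
    # Run from the top of the unpacked source tree, after ./configure.
    for path in find_prebuilt_objects("."):
        print("unexpected object file before make:", path)
```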
Of course, an attacker could sneak around this as well by moving the process for unpacking the backdoor object to a Makefile rule, but that is more likely to "stick out" to an observant user, as well as being an easy target for automated analysis ("Which files have 'special' rules?") since you cannot obfuscate those from make(1) and expect them to still work.
In this case, the backdoor was ultimately discovered when it caused performance problems in sshd, which should not be using liblzma at all, but gets linked with it courtesy of libsystemd on major GNU/Linux distributions. Yes, this means that systemd is a contributing factor to this incident, and that is aggravated by its unnecessary use of excessive dependencies. (Sending a notification that a daemon is ready should /not/ require compression support of any type. The "katamari" architecture model used in systemd had the effect here of broadening the supply-chain attack surface for OpenSSH sshd to include xz-utils, which is insane.)
The bulk of the attack payload seems to have been stored in the Git repository, disguised as binary test data in files tests/files/{bad-3-corrupt_lzma2.xz,good-large_compressed.lzma}. The modified build-to-host.m4 merely added code to configure to start the process of unpacking the backdoor. In a build from Git, the legitimate build-to-host.m4 would get copied in from gnulib and the backdoor would remain hidden.
Maybe the best revision to the GNU Coding Standards would be that releases should, if at all possible, contain only text? Any binary files needed for testing can be generated during "make check" if necessary, with generator programs packaged (as source or scripts) in the release.
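A "text only" rule would at least be easy to check mechanically. As a sketch under stated assumptions: the Python below treats a file as binary if it contains a NUL byte or is not valid UTF-8, which is a heuristic of my choosing and not anything the GNU Coding Standards define; the function names are likewise hypothetical. It reads each file whole, which is fine for a source release but not for huge trees.

```python
import os


def looks_binary(data):
    """Heuristic: binary if the bytes contain NUL or are not valid UTF-8."""
    if b"\x00" in data:
        return True
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def binary_files(tree):
    """Return files under `tree` (relative paths) that do not look like text."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(tree):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as fh:
                if looks_binary(fh.read()):
                    hits.append(os.path.relpath(full, tree))
    return sorted(hits)


if __name__ == "__main__":
    # Run from the top of an unpacked release tree.
    for path in binary_files("."):
        print("non-text file in release:", path)
```

A check like this only flags files for review. Text can carry a payload too, so it narrows where to look rather than proving a release clean.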
-- Jacob