Re: Automake 1.11.2 released
From: Antonio Diaz Diaz
Subject: Re: Automake 1.11.2 released
Date: Mon, 26 Dec 2011 18:02:41 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.11) Gecko/20050905
Hello Miles,
Miles Bader wrote:
> What's the difference between xz and lzip anyway...?
> I've never even heard of lzip, but the debian package description makes
> it sound very similar to xz...
The main difference between xz and lzip is that xz includes some binary
filters for executable code of some processors, but I do not see this as
an advantage at all because:
1) The binary filters of xz are useless for compressing source tarballs.
2) New filters have to be added to the format as new processors appear
in the market.
3) Those so-called BCJ filters have known problems and it is planned to
replace them in a future version of xz. The xz format is not the most
stable alternative for long term archiving. See this quote from the xz
man page:
-------------------------------------------------------------------
These BCJ filters have known problems related to the compression
ratio:
· Some types of files containing executable code (e.g. object
files, static libraries, and Linux kernel modules) have the
addresses in the instructions filled with filler values.
These BCJ filters will still do the address conversion, which
will make the compression worse with these files.
· Applying a BCJ filter on an archive containing multiple similar
executables can make the compression ratio worse than not
using a BCJ filter. This is because the BCJ filter doesn't
detect the boundaries of the executable files, and doesn't
reset the address conversion counter for each executable.
Both of the above problems will be fixed in the future in a new
filter. The old BCJ filters will still be useful in embedded
systems, because the decoder of the new filter will be bigger
and use more memory.
-------------------------------------------------------------------
4) The xz format is supposed to be extensible, but it will not be
extended with new compression algorithms, just as gzip wasn't. It makes
no sense to combine two or more different compression algorithms,
probably with different command line options, in a big executable.
5) The xz format is fragmented. See this quote also from the xz man page:
-------------------------------------------------------------------
Embedded .xz decompressor implementations like XZ Embedded don't
necessarily support files created with integrity check types other
than none and crc32. Since the default is --check=crc64, you must use
--check=none or --check=crc32 when creating files for embedded
systems.
Outside embedded systems, all .xz format decompressors support all the
check types, or at least are able to decompress the file without
verifying the integrity check if the particular check is not
supported.
XZ Embedded supports BCJ filters, but only with the default start
offset.
-------------------------------------------------------------------
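As a rough illustration of the check-type issue in the quote above, here is a minimal sketch using Python's standard lzma module (a binding to liblzma, not something mentioned in this thread). It creates an .xz stream with a CRC32 check instead of the default CRC64, and reads the check ID back from the stream header. The function names and the sample payload are hypothetical; the header layout (6 magic bytes followed by 2 stream-flag bytes, with the check ID in the low 4 bits of the second) follows the .xz file format specification as I understand it.

```python
import lzma

XZ_MAGIC = b"\xfd7zXZ\x00"  # 6-byte magic at the start of every .xz stream


def xz_for_embedded(payload: bytes) -> bytes:
    """Compress with a CRC32 check, the rough equivalent of `xz --check=crc32`."""
    return lzma.compress(payload, format=lzma.FORMAT_XZ, check=lzma.CHECK_CRC32)


def xz_check_id(stream: bytes) -> int:
    """Return the integrity check ID stored in the .xz stream header."""
    if stream[:6] != XZ_MAGIC:
        raise ValueError("not an .xz stream")
    # Stream flags are bytes 6 and 7; the check ID is the low nibble of byte 7.
    return stream[7] & 0x0F


sample = b"example payload " * 64        # hypothetical test data
embedded = xz_for_embedded(sample)       # CRC32 check
default = lzma.compress(sample)          # library default check (CRC64)
```

A decoder that only understands none and crc32 could use a test like xz_check_id() to reject other files early instead of failing partway through decompression.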
OTOH, the lzip family of programs has some genuine advantages over xz:
1) Lzip is copylefted. This should be important for us in GNU.
2) Lunzip is a decompressor for lzip files much smaller than xzminidec
(the xz-embedded "small" xz decompressor), and can decompress any lzip
file, while xzminidec can only decompress specially crafted xz files.
  lunzip     (31kB)
  lzip       (89kB)
  xzminidec (171kB)
BTW, some programs of the lzip family, like lunzip, are written in C for
better portability to embedded and mobile systems.
3) The dictionary size encoded by lzip is more fine-grained than that of
xz, saving memory when decompressing.
4) Lziprecover can recover corrupt lzip files with an efficacy never
seen before in a gzip-like compressed format. And it can recover files
produced by any of the compressors in the lzip family, as all of them
are compatible. No such tool exists for xz, and given the complexity and
extensibility of the xz format, I think an effective recovery tool for
xz can't be written.
5) The lzip family includes plzip, a massively parallel (multi-threaded)
compressor.
6) There exist three related but independent compressor implementations
producing files in lzip format (lzip, clzip and minilzip/lzlib) which
are verified to produce bit-identical output, much like 3-way redundancy
in mission-critical software. AFAIK, xz implementations are not tested
to this level.
7) Using xz for software distribution may not be much of a problem, as
the format of compressed tarballs can be changed overnight. But for
long-term archiving, the simpler the format, the more likely it is that
the data can be recovered decades later.
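
To make point 3 above more concrete, the following sketch decodes the dictionary-size byte of each format and compares the sizes each one can express. It is based on my reading of the lzip manual (bits 4-0 hold the base-2 logarithm of a base size, bits 7-5 the number of 1/16 "wedges" subtracted from it) and of the .xz file format specification for the LZMA2 filter (sizes of the form 2^n or 2^n + 2^(n-1)). The function names are hypothetical, and the encodings should be checked against the current specifications before relying on them.

```python
MIN_DICT = 1 << 12  # 4 KiB, the smallest dictionary size in both formats


def lzip_dict_size(coded: int) -> int:
    """Decode the dictionary-size byte of an lzip header."""
    base_log2 = coded & 0x1F
    if not 12 <= base_log2 <= 29:
        raise ValueError("invalid lzip dictionary size byte")
    size = 1 << base_log2
    if size > MIN_DICT:
        # Subtract 0 to 7 wedges, each 1/16 of the base size.
        size -= (size // 16) * ((coded >> 5) & 0x07)
    return size


def xz_lzma2_dict_size(coded: int) -> int:
    """Decode the dictionary-size byte of the LZMA2 filter in an .xz file."""
    if not 0 <= coded <= 40:
        raise ValueError("invalid LZMA2 dictionary size byte")
    if coded == 40:
        return 0xFFFFFFFF  # special case: 4 GiB - 1
    return (2 | (coded & 1)) << (coded // 2 + 11)


def sizes_in_range(sizes, lo, hi):
    """Return the distinct sizes between lo and hi (inclusive), sorted."""
    return sorted({s for s in sizes if lo <= s <= hi})


LZIP_SIZES = [lzip_dict_size(w << 5 | b) for b in range(12, 30) for w in range(8)]
XZ_SIZES = [xz_lzma2_dict_size(c) for c in range(41)]

MIB = 1 << 20
lzip_steps = sizes_in_range(LZIP_SIZES, 1 * MIB, 2 * MIB)
xz_steps = sizes_in_range(XZ_SIZES, 1 * MIB, 2 * MIB)
```

Under these assumptions, lzip_steps holds nine sizes between 1 MiB and 2 MiB while xz_steps holds three, so a decompressor can allocate a buffer closer to what the file actually needs.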
Best regards,
Antonio.