[Lzip-bug] Tarlz 0.23 released


From: Antonio Diaz Diaz
Subject: [Lzip-bug] Tarlz 0.23 released
Date: Sun, 25 Sep 2022 18:03:26 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.9.1.19) Gecko/20110420 SeaMonkey/2.0.14

I am pleased to announce the release of tarlz 0.23.

Tarlz is a massively parallel (multi-threaded) combined implementation of the tar archiver and the lzip compressor. Tarlz uses the compression library lzlib.

Tarlz creates tar archives using a simplified and safer variant of the POSIX pax format compressed in lzip format, keeping the alignment between tar members and lzip members. The resulting multimember tar.lz archive is fully backward compatible with standard tar tools like GNU tar, which treat it like any other tar.lz archive. Tarlz can append files to the end of such compressed archives.

Keeping the alignment between tar members and lzip members has two advantages. It adds an indexed lzip layer on top of the tar archive, making it possible to decode the archive safely in parallel. It also minimizes the amount of data lost in case of corruption. In contrast, compressing a tar archive with plzip may even double the number of files lost for each damaged lzip member, because plzip does not keep the members aligned.

Tarlz can create tar archives with five levels of compression granularity: per file (--no-solid), per block (--bsolid, default), per directory (--dsolid), appendable solid (--asolid), and solid (--solid). It can also create uncompressed tar archives.

Of course, compressing each file (or each directory) individually can't achieve a compression ratio as high as compressing the whole tar archive solidly, but it has the following advantages:

   * The resulting multimember tar.lz archive can be decompressed in
     parallel, multiplying the decompression speed.

   * New members can be appended to the archive (by removing the
     end-of-archive member), and unwanted members can be deleted from the
     archive, just like in an uncompressed tar archive.

   * It is a safe POSIX-style backup format. In case of corruption, tarlz
     can extract all the undamaged members from the tar.lz archive,
     skipping over the damaged members, just like the standard
     (uncompressed) tar. Moreover, the option '--keep-damaged' can be used
     to recover as much data as possible from each damaged member, and
     lziprecover can be used to recover some of the damaged members.

   * A multimember tar.lz archive is usually smaller than the corresponding
     solidly compressed tar.gz archive, except when individually
     compressing files smaller than about 32 KiB.

Note that the POSIX pax format has a serious flaw. The metadata stored in pax extended records are not protected by any kind of check sequence. Because of this, tarlz protects the extended records with a Cyclic Redundancy Check (CRC) in a way compatible with standard tar tools.
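As a rough illustration of the idea, the sketch below builds pax extended records in the standard '<length> <keyword>=<value>' form and appends one extra record carrying a CRC of the others. The keyword 'EXAMPLE.crc32' and the use of zlib's CRC-32 are assumptions made for this example only; the actual keyword and CRC variant used by tarlz are described in its manual. A tool that does not know the extra keyword can simply ignore it, which is what keeps the approach compatible.

```python
import zlib

CRC_KEYWORD = "EXAMPLE.crc32"  # hypothetical keyword, NOT the one tarlz uses


def pax_record(keyword, value):
    # A pax record is "<length> <keyword>=<value>" followed by a newline.
    body = f" {keyword}={value}\n".encode("utf-8")
    length = len(body)
    # The length field counts its own digits, so iterate to a fixed point.
    while len(body) + len(str(length)) != length:
        length = len(body) + len(str(length))
    return str(length).encode("ascii") + body


def split_records(block):
    # Yield (keyword, value, raw_bytes) for each record in a block.
    pos = 0
    while pos < len(block):
        space = block.index(b" ", pos)
        length = int(block[pos:space])
        raw = block[pos:pos + length]
        keyword, _, value = raw[space - pos + 1:-1].partition(b"=")
        yield keyword.decode("utf-8"), value.decode("utf-8"), raw
        pos += length


def protect(records):
    # Append a record holding the CRC of all the other records.
    data = b"".join(records)
    return data + pax_record(CRC_KEYWORD, f"{zlib.crc32(data):08X}")


def verify(block):
    # Recompute the CRC over every record except the CRC record itself.
    data = b""
    stored = None
    for keyword, value, raw in split_records(block):
        if keyword == CRC_KEYWORD:
            stored = value
        else:
            data += raw
    return stored is not None and stored == f"{zlib.crc32(data):08X}"
```

For example, verify(protect([pax_record("uid", "3000000")])) returns True, and changing a digit of the value afterwards makes verify() return False.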

The homepage is at http://www.nongnu.org/lzip/tarlz.html

An online manual for tarlz can be found at http://www.nongnu.org/lzip/manual/tarlz_manual.html

The sources can be downloaded from
http://download.savannah.gnu.org/releases/lzip/tarlz/

The sha256sum is:
3cefb4f889da25094f593b43a91fd3aaba33a02053a51fb092e9b5e8adb660a3 tarlz-0.23.tar.lz
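
The downloaded file can be checked against this digest with the sha256sum utility or with a few lines of code. The following is a minimal sketch using Python's hashlib; the function names are chosen for this example, and it assumes the tarball has been saved in the current directory under its original name.

```python
import hashlib

# Digest published in this announcement for tarlz-0.23.tar.lz.
EXPECTED = "3cefb4f889da25094f593b43a91fd3aaba33a02053a51fb092e9b5e8adb660a3"


def sha256_of(path, chunk_size=1 << 20):
    # Hash the file in chunks so that large files need little memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected=EXPECTED):
    # True if the SHA-256 of the file equals the expected hex digest.
    return sha256_of(path) == expected.lower()
```

Calling matches("tarlz-0.23.tar.lz") should return True for an intact download.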


Changes in version 0.23:

* Tarlz now can create and decode the extended records 'atime' and 'mtime', allowing times beyond the ustar range (before 1970-01-01 00:00:00 UTC or after 2242-03-16 12:56:31 UTC).
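
The upper limit quoted above follows from the ustar header layout, where the mtime field holds 11 octal digits. A small sketch of the arithmetic (the names are chosen for this example):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# The 12-byte ustar mtime field holds 11 octal digits plus a terminator.
USTAR_MTIME_DIGITS = 11
USTAR_MAX_MTIME = 8 ** USTAR_MTIME_DIGITS - 1  # largest storable value


def fits_in_ustar(seconds):
    # True if a whole-second timestamp can be stored without a pax record.
    return 0 <= seconds <= USTAR_MAX_MTIME


def last_ustar_time():
    # Latest point in time representable in the ustar mtime field.
    return EPOCH + timedelta(seconds=USTAR_MAX_MTIME)
```

Here last_ustar_time() gives 2242-03-16 12:56:31 UTC, and fits_in_ustar(-1) is False, which is why times before the epoch need an extended record.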

* Tarlz now can create and decode the extended records 'uid' and 'gid', allowing user and group IDs beyond the ustar limit of 2_097_151.
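
The ID limit comes from the 7 octal digits available in the ustar uid and gid fields. The sketch below shows the check an archiver might make before falling back to an extended record; the function names are hypothetical and are not taken from the tarlz sources.

```python
# The 8-byte ustar uid/gid fields hold 7 octal digits plus a terminator.
USTAR_MAX_ID = 8 ** 7 - 1


def needs_extended_id(value):
    # True if the ID does not fit in the ustar header field.
    return value > USTAR_MAX_ID


def ustar_id_field(value):
    # Encode an ID as 7 zero-padded octal digits followed by a NUL byte.
    if value < 0 or needs_extended_id(value):
        raise ValueError("ID out of ustar range; use a pax extended record")
    return f"{value:07o}".encode("ascii") + b"\0"
```

For instance, ustar_id_field(1000) yields b"0001750\0", while an ID of 3000000 raises ValueError and would have to be stored in a 'uid' or 'gid' extended record.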

* The new option '--ignore-overflow', which makes '-d, --diff' ignore differences in mtime caused by overflow on 32-bit systems, has been added.

* Tarlz now refuses to read archive data from a terminal or write archive data to a terminal. (Reported by DustDFG).

* In the date format of option '--mtime' the time of day 'HH:MM:SS' is now optional and defaults to '00:00:00'. Both space and 'T' are now accepted as separator between date and time.
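
A parser accepting these forms could look like the sketch below. It is not the tarlz implementation: the 'YYYY-MM-DD' date layout with a four-digit year and the interpretation of the result as UTC are assumptions made for this example, and other argument forms that '--mtime' may accept are not handled.

```python
import re
from datetime import datetime, timezone

# Date, optionally followed by a space or 'T' and the time of day.
_DATE_RE = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})(?:[ T](\d{2}):(\d{2}):(\d{2}))?")


def parse_mtime(text):
    # Parse 'YYYY-MM-DD[ HH:MM:SS]' or 'YYYY-MM-DD[THH:MM:SS]'.
    m = _DATE_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"invalid date: {text!r}")
    year, month, day = (int(g) for g in m.group(1, 2, 3))
    if m.group(4) is None:
        hour = minute = second = 0  # time of day defaults to 00:00:00
    else:
        hour, minute, second = (int(g) for g in m.group(4, 5, 6))
    # datetime() itself rejects out-of-range fields such as month 13.
    return datetime(year, month, day, hour, minute, second,
                    tzinfo=timezone.utc)
```

With this sketch, parse_mtime("2022-09-25 18:03:26") and parse_mtime("2022-09-25T18:03:26") return the same value, and parse_mtime("2022-09-25") returns midnight of that day.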

* Diagnostics caused by invalid arguments to command line options now show the argument and the name of the option.

* Tarlz now diagnoses separately the failure to create an intermediate directory during extraction.

* Failure to extract a member due to environmental problems is no longer fatal in serial extraction. (It was already non-fatal in parallel extraction).

* The diagnostics emitted by the parallel decoder should now be identical to the corresponding diagnostics of the serial decoder.

* Column alignment has been improved in listings by printing "user/group size" in a field of minimum width 19 with at least 8 characters for size.
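
One possible reading of this rule is sketched below: the size is right-aligned in whatever room is left of the 19 columns after "user/group" and a separating space, but never in fewer than 8 columns. The exact layout used by tarlz may differ, and the function name is hypothetical.

```python
def format_owner_size(user, group, size, min_total=19, min_size=8):
    # Format "user/group size" with a minimum total width of min_total
    # and at least min_size columns for the right-aligned size.
    owner = f"{user}/{group}"
    size_width = max(min_size, min_total - len(owner) - 1)
    return f"{owner} {size:>{size_width}}"
```

Short owner names are padded to a total of 19 columns, while long ones push the size to the right but keep its 8-column field, so sizes stay aligned in the common case.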

* The diagnostic shown when the filesystem reports a wrong st_size for a symbolic link has been improved. (Reported by Jason Lenz).

* The diagnostic "File is the archive" has been changed to "Archive can't contain itself" following a similar change made by Paul Eggert to GNU tar.

* The warning "Removing leading '/' from member names." is no longer shown when compressing, nor when the member causing it is excluded with '--exclude'.

* The texinfo category of the manual has been changed from 'Data Compression' to 'Archiving' to match that of GNU tar.

* 'end-of-archive' (EOA) is now used consistently to refer to the blocks of binary zeros used to mark the end of the archive.

* Operations are now listed before options in the --help output and in the manual.

* Many small improvements have been made to the code and documentation.


Please send bug reports and suggestions to lzip-bug@nongnu.org


Regards,
Antonio Diaz, tarlz author and maintainer.

--
If you care about data safety and long-term archiving, please consider using lzip. See http://www.nongnu.org/lzip/lzip_benchmark.html ,
http://www.nongnu.org/lzip/manual/lzip_manual.html#Quality-assurance and
http://www.nongnu.org/lzip/safety_of_the_lzip_format.html Thanks.



