From: Wolfgang Liessmann
Subject: Section "2.10.4 The 'Block Check' field" in your paper: Xz format inadequate for long-term archiving
Date: Tue, 28 Mar 2023 04:45:41 +0200

Dear Antonio Diaz Diaz,
In your paper
Xz format inadequate for long-term archiving
https://www.nongnu.org/lzip/xz_inadequate.html
in section "2.10.4 The 'Block Check' field" you present the formula
Inaccuracy = ( compressed_size * Pudc + CS_size ) / ( compressed_size + CS_size )
which is neither explained nor derived, so the reader cannot see why it must hold.
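
To make the question concrete, here is a minimal sketch that evaluates the formula exactly as quoted. The paper excerpt does not define the units or the value of Pudc, so two assumptions of mine go into it: sizes are counted in bytes (8 for CRC64, 32 for SHA-256), and Pudc is taken as 2^-n for an n-bit check. The names `inaccuracy_for` and `CHECKS` are only illustrative.

```python
def inaccuracy(compressed_size: float, pudc: float, cs_size: float) -> float:
    """Evaluate the formula from section 2.10.4 as quoted in this mail."""
    return (compressed_size * pudc + cs_size) / (compressed_size + cs_size)


# Assumption (mine, not from the paper): check sizes are in bytes.
CHECKS = {"CRC64": 8, "SHA-256": 32}


def inaccuracy_for(name: str, compressed_size: float) -> float:
    """Apply the formula to a named check at a given compressed size in bytes."""
    cs_size = CHECKS[name]
    # Assumption (mine): Pudc = 2**-n for an n-bit check value.
    pudc = 2.0 ** -(8 * cs_size)
    return inaccuracy(compressed_size, pudc, cs_size)


if __name__ == "__main__":
    for name in CHECKS:
        print(name, f"{inaccuracy_for(name, 1e9):.1e}")
```

Whether this is the intended reading of compressed_size, Pudc and CS_size is part of what I am asking.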
In particular, the statement
It should be noted that SHA-256 provides worse accuracy than CRC64 for
all possible block sizes.
and the inaccuracy of 3e-08 at a compressed size of 1e+09 bytes (= 1 GB) seem, at
first glance, to contradict basic cryptographic expectations.
The rule of thumb (a mathematical approximation) is that the security level,
measured in bits of security, is half of the hash length, provided there is no
attack on the algorithm (and no effective attack on SHA-256 is known as of now).
This means that the probability of a collision of two SHA-256 hashes is
less than
1 / 2^(256/2)
= 1 / 2^128
= 1 / 10^(128 * ln(2) / ln(10))
= ca. 1 / 10^38.5
= ca. 10^-38.5
= ca. 1e-38.5
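
The conversion above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are mine.

```python
import math

hash_bits = 256
security_bits = hash_bits // 2              # rule of thumb: half the hash length
exponent = security_bits * math.log10(2)    # same as 128 * ln(2) / ln(10)
bound = 2.0 ** -security_bits               # 1 / 2^128 as a float

# Prints: 2^-128 = 10^-38.5 = 2.9e-39
print(f"2^-{security_bits} = 10^-{exponent:.1f} = {bound:.1e}")
```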
1e-38.5 is very far from (much less than) 3e-08, so it is unclear how the value
3e-08 is obtained.
While at first glance lzip does seem to have a better design than xz, and its
compression ratio is better when the maximum compression levels are compared,
it is not clear to me why a secure cryptographic hash function should perform
that badly.
As others may be interested too, you might reply publicly to my post at
https://stackoverflow.com/a/75852528/20996936
or in the lzip-bug Archives at
https://lists.nongnu.org/archive/html/lzip-bug/
Thanks!