bug-gzip

Re: spec compliance: header CRC?


From: Greg Roelofs
Subject: Re: spec compliance: header CRC?
Date: Thu, 12 Aug 2010 22:58:49 -0700

Paul Eggert wrote:

>>      http://gregroelofs.com/test/testCompressThenConcat.txt.gz

> Thanks, I've verified that the new code works with that example.
> It's a bit much to turn that into a test case.  Perhaps if I
> find time I'll write a smaller one.

I just attached to https://issues.apache.org/jira/browse/MAPREDUCE-1927
a small zipfile containing a script and some binary bits to generate all
32 non-encrypted gzip header variants.  You can either grab that or find
the results here:

        http://gregroelofs.com/test/all-gz-header-types-20100812.zip

The script is dependent on this CRC checker:

        http://gregroelofs.com/code/check-latest.tgz

(There's probably some standard util available these days, but I didn't
look.)
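
For readers without the zipfile, here is a minimal sketch of how the 32 header variants could be produced. It is not the attached script. It assumes the 32 variants are the combinations of the five flag bits defined in RFC 1952 (FTEXT, FHCRC, FEXTRA, FNAME, FCOMMENT) and uses Python's zlib.crc32 in place of a separate CRC checker. The names build_member and all_variants, and the placeholder name, comment and extra-field values, are hypothetical.

```python
import zlib

# Flag bits in the FLG byte, per RFC 1952.
FTEXT, FHCRC, FEXTRA, FNAME, FCOMMENT = 0x01, 0x02, 0x04, 0x08, 0x10


def build_member(payload, flags, name=b"test.txt",
                 comment=b"placeholder comment",
                 extra=b"AB\x02\x00hi"):
    """Return one gzip member whose header carries the fields selected by flags.

    The default extra field is a single subfield: ID bytes 'A','B',
    a two-byte little-endian length (2), then the data b"hi".
    """
    # Fixed 10 bytes: ID1, ID2, CM=8 (deflate), FLG, MTIME=0, XFL=2, OS=255.
    header = bytes([0x1F, 0x8B, 8, flags]) + bytes(4) + bytes([2, 255])
    if flags & FEXTRA:
        header += len(extra).to_bytes(2, "little") + extra
    if flags & FNAME:
        header += name + b"\x00"
    if flags & FCOMMENT:
        header += comment + b"\x00"
    if flags & FHCRC:
        # CRC16 = low 16 bits of the CRC-32 over every header byte before it.
        header += (zlib.crc32(header) & 0xFFFF).to_bytes(2, "little")

    # Raw deflate stream (negative wbits: no zlib or gzip wrapper).
    co = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    body = co.compress(payload) + co.flush()

    # Trailer: CRC-32 and length (mod 2**32) of the uncompressed data.
    trailer = (zlib.crc32(payload) & 0xFFFFFFFF).to_bytes(4, "little")
    trailer += (len(payload) & 0xFFFFFFFF).to_bytes(4, "little")
    return header + body + trailer


def all_variants(payload):
    """Map each of the 32 flag combinations to a gzip member."""
    return {flags: build_member(payload, flags) for flags in range(32)}
```

Each member can be written to its own file, or fed to zlib.decompress(member, 16 + zlib.MAX_WBITS) as a quick sanity check.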

Oh, and the gzip file quoted above (3409 bytes) was slightly broken; I had
overlooked the part about extra-field subfields, so its extra field was
missing the two subfield-ID bytes and two subfield-length bytes.  I just
replaced it with a 3413-byte corrected version.  (gzip doesn't look inside
the extra field anyway, but if you want a correct test case, here you go.
All 16 of the new ones share the same fix, btw.)
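
To illustrate the four missing bytes: per RFC 1952 the extra field is a two-byte XLEN followed by subfields, each made of two ID bytes (SI1, SI2), a two-byte little-endian length, and the data. The sketch below packs and parses that layout. The function names and the sample ID bytes are hypothetical, not taken from the test files.

```python
def pack_subfield(si1, si2, data):
    """One subfield: two ID bytes, two-byte little-endian LEN, then data."""
    return bytes([si1, si2]) + len(data).to_bytes(2, "little") + data


def pack_extra_field(subfields):
    """XLEN (two bytes, little-endian) followed by the packed subfields."""
    body = b"".join(pack_subfield(si1, si2, data)
                    for si1, si2, data in subfields)
    return len(body).to_bytes(2, "little") + body


def parse_extra_field(field):
    """Return [(si1, si2, data), ...]; raise ValueError if malformed."""
    if len(field) < 2:
        raise ValueError("missing XLEN")
    xlen = int.from_bytes(field[:2], "little")
    body = field[2:2 + xlen]
    if len(body) != xlen:
        raise ValueError("extra field shorter than XLEN")
    subfields = []
    pos = 0
    while pos < xlen:
        if pos + 4 > xlen:
            raise ValueError("truncated subfield header")
        si1, si2 = body[pos], body[pos + 1]
        length = int.from_bytes(body[pos + 2:pos + 4], "little")
        data = body[pos + 4:pos + 4 + length]
        if len(data) != length:
            raise ValueError("truncated subfield data")
        subfields.append((si1, si2, data))
        pos += 4 + length
    return subfields
```

A bare payload with no ID and length bytes, such as b"\x03\x00xyz", fails the parse, while the same payload wrapped in a subfield is exactly four bytes longer.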

I've also tested your 20100703 header-CRC patch (backported to 1.4), and
it works great--thanks again.  Related fixes/suggestions:

 - the "header16 != crc16" error message in gzip.c would look nicer with
   a pair of "%04x" instead of "%x".  (A CPAN Perl module has a zero-the-
   upper-byte bug, and the mismatched one-byte vs. two-byte error messages
   looked weird.)

 - algorithm.doc needs updating:
   - "bit 1 set: continuation of multi-part gzip file" -> "bit 1 set: header
     CRC-16 present"
   - new "? bytes  optional 16-bit header CRC" line immediately after
     "? bytes  optional file comment, zero terminated"
   - should mention spec at http://www.ietf.org/rfc/rfc1952.txt
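
Both suggestions can be sketched together. The code below walks the header in the RFC 1952 field order (extra field, file name, comment, then the optional CRC-16 when bit 1 is set) and formats a mismatch with zero-padded "%04x". It is an illustration in Python, not the gzip.c code; check_header_crc and mismatch_message are hypothetical names, and the message wording is a placeholder.

```python
import zlib

FHCRC, FEXTRA, FNAME, FCOMMENT = 0x02, 0x04, 0x08, 0x10


def check_header_crc(data):
    """Return (stored, computed) CRC-16 values, or None if bit 1 is clear."""
    if len(data) < 10 or data[0] != 0x1F or data[1] != 0x8B:
        raise ValueError("not a gzip stream")
    flags = data[3]
    pos = 10  # end of the fixed part of the header
    if flags & FEXTRA:
        pos += 2 + int.from_bytes(data[pos:pos + 2], "little")
    if flags & FNAME:
        pos = data.index(b"\x00", pos) + 1   # zero-terminated file name
    if flags & FCOMMENT:
        pos = data.index(b"\x00", pos) + 1   # zero-terminated comment
    if not flags & FHCRC:
        return None
    if pos + 2 > len(data):
        raise ValueError("truncated header")
    stored = int.from_bytes(data[pos:pos + 2], "little")
    computed = zlib.crc32(data[:pos]) & 0xFFFF
    return stored, computed


def mismatch_message(stored, computed, fmt="%04x"):
    """Format both values with the same width so they line up."""
    return ("header16 " + fmt + " != crc16 " + fmt) % (stored, computed)
```

With a stored value whose upper byte was zeroed, say 0x00a7 against a computed 0x3ca7, "%x" gives "a7 != 3ca7" while "%04x" gives "00a7 != 3ca7".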

Also, are you aware of _any_ gzip utilities that support encryption?  The
full, drop-in crypt.c and crypt.h have been available for a decade, but
it appears that the gzip code is explicitly not wired to support it.  Does
anything out there do so?  If not, you might want to add another note to
algorithm.doc mentioning that crypto's not official in either the code
sense or the spec sense.  (Or did an older version of gzip support it in
gzip files?)

Thanks,
  Greg


