
bug#22768: Crash safety


From: Paul Eggert
Subject: bug#22768: Crash safety
Date: Sun, 28 Feb 2016 00:30:59 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1

Antonio Diaz Diaz wrote:

it may be a cause of feature creep. If gzip fsyncs the output file it
might also test it, or even compare it with the input file, before deleting the
input file.

Feature creep is something we should avoid. Here, though, it's a real pain to synchronize correctly and many people will get it wrong. (See my commentary at the end of this email for one example of getting it wrong.) By comparison, comparing the decompressed output with the input file is something that most people will probably get right, so it's less useful to add a gzip option for that.
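
For illustration only, here is a minimal Python sketch of that comparison: it streams the decompressed contents of the .gz file against the original, chunk by chunk. The function name gz_matches_original and the chunk size are hypothetical choices made for this sketch; nothing here is taken from gzip itself, and zcmp does the same job from the shell.

```python
import gzip
import zlib


def gz_matches_original(original_path, gz_path, chunk_size=1 << 16):
    """Return True if gz_path decompresses to exactly the bytes of original_path."""
    try:
        with open(original_path, "rb") as plain, gzip.open(gz_path, "rb") as unpacked:
            while True:
                a = plain.read(chunk_size)
                b = unpacked.read(chunk_size)
                if a != b:
                    return False  # contents (or lengths) differ
                if not a:
                    return True   # both streams ended together
    except (OSError, EOFError, zlib.error):
        # An unreadable, truncated, or corrupt file counts as a mismatch.
        return False
```

Note that a successful comparison says nothing about whether the compressed data has reached stable storage; that is the separate and harder problem.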

Second, as doing it right in all circumstances may be impossible

Sure, as some file systems do not support fsync. Still, gzip should do what it can.

it may become an endless source of bug reports.

I doubt it. gzip has run unsafely for decades, and this is the first bug report about it -- one discovered by code inspection, not by actual failure.

(fsyncing also the destination's directory,

Yes, that needs fixing.  Done in the patches I just now emailed to you.

opening the output with O_DIRECT,...).

I doubt whether that feature will be needed or useful for gzip.

Third, it fights against other layers of the system, like the filesystem,
instead of collaborating with them.

True, fsync is a bad design. But that is no excuse for gzip losing data.

Fourth, it fights against user's wishes instead of obeying them.

This should not be a problem if --synchronous is a new option, defaulting to the old (unsynchronized) behavior.

I think that the best way of guarding an important file against all bugs and
crashes is an extended version of the procedure already documented in the manual
of lzip:

1) gzip --keep file     # don't delete input
2) sync                 # commit output and directory to disk
3) zcmp file file.gz    # verify output
4) rm file              # then remove input

That approach does not suffice, because 'sync' does not guarantee that the output data has been synchronized to disk. See:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/sync.html

With GNU 'sync' there is a workaround, but it is not portable to non-GNU systems; besides, the workaround is not obvious.
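
As a rough illustration of the ordering discussed in this thread (write the output, fsync the output file, fsync its directory, and only then remove the input), here is a minimal Python sketch. It is not the gzip patch and not the GNU 'sync' workaround mentioned above; the function name compress_durably is hypothetical, and the sketch assumes a POSIX system where a directory can be opened read-only and passed to fsync. On file systems that do not support fsync, the calls may fail or be no-ops.

```python
import gzip
import os
import shutil


def compress_durably(path):
    """Compress path to path + '.gz', sync it, then remove the input."""
    gz_path = path + ".gz"
    with open(path, "rb") as src, open(gz_path, "wb") as raw:
        # GzipFile does not close the underlying file object, so the
        # descriptor is still open for fsync after the gzip trailer is written.
        with gzip.GzipFile(fileobj=raw, mode="wb") as gz:
            shutil.copyfileobj(src, gz)
        raw.flush()             # push Python's buffer to the kernel
        os.fsync(raw.fileno())  # ask the kernel to push the data to storage

    # Sync the containing directory so the new directory entry is durable too.
    dir_fd = os.open(os.path.dirname(os.path.abspath(gz_path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)

    # Only now is it reasonable to delete the input.
    os.remove(path)
    return gz_path
```

If a crash interrupts this sequence, the input file is still present until both fsync calls have returned, which is the property the shell procedure with plain 'sync' cannot guarantee portably.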




