
Re: Error (bug) in gunzip


From: Bob Proulx
Subject: Re: Error (bug) in gunzip
Date: Mon, 10 Sep 2007 22:26:55 -0600
User-agent: Mutt/1.5.9i

Please keep replies to the mailing list so that others may participate
in the discussion.  Thanks.

Also, your mailer is very strangely adding carriage returns to every
line.  This is making your messages much more difficult to read than
most messages.  If there is anything that you could do to fix that it
would definitely make reading and responding to your email easier.

address@hidden wrote:
> The file is created on Windows with Winzip
> Original File size is around 54GB, zipped size around 13 GB
> 
> Only 1 file.

Okay.  In theory, and as documented, gunzip should be able to handle it.

> This .zip file is copied using winscp (binary mode) from Windows server to
> Linux Server

I can see that you know about the problem of text mode file copy but
if you can verify the binary bit integrity of the file after the copy,
such as with md5sum before and after, it would help to confirm that it
is definitely not getting corrupted in the copy.

> The unzip doesn't want to know about it (Very likely file to big).

Those are quite large files.  I am sure that they take a long time to
process on both sides.

> Small files OK see my example:

A good data point.

> address@hidden Database]# ls -liatr
> total 51289396
> 8863761 drwxrwxrwx    4 sybase   root         4096 Sep  5 11:13 ..
> 9256980 -rw-r--r--    1 sybase   root      4696064 Sep 10 19:31 master.dmp
> 9256982 -rw-r--r--    1 sybase   root     59899904 Sep 10 19:32 sybsystemprocs.dmp
> 9256983 -rw-r--r--    1 sybase   root     52403593216 Sep 10 20:38 LifetrackTest01.dmp
> 9256984 -rw-r--r--    1 sybase   root       446464 Sep 10 20:38 sybsystemdb.dmp
> 9256979 drwxrwxrwx    2 sybase   root         4096 Sep 11 13:57 .

> 9256981 -rw-r--r--    1 sybase   root       374784 Sep 10 19:31 model.dmp
> address@hidden Database]# zip model  model.dmp
>   adding: model.dmp (deflated 84%)

Okay.  Looks good.

> address@hidden Database]# zip Lifetrack LifetrackTest01.dmp
>         zip warning: name not matched: LifetrackTest01.dmp
>
> zip error: Nothing to do! (Lifetrack.zip)

Hmm...  That looks like an error from zip when compressing the file.
Is that really what you were intending to show there?  I assume that
another invocation was successful or you would not have a file, right?

You said that 'unzip' could not handle the file, probably because it
was too large.  But could 'gzip' here work instead of 'zip'?  Then you
would have gzip both compressing the file and uncompressing it.  That
should have a better chance of success.
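
If there is no gzip program on the Windows machine, the following is a
rough sketch of the same idea in Python.  I am assuming Python is
installed there, which nothing in this thread confirms, and gzip_file
and gunzip_file are names I made up for illustration.  Python's gzip
module writes the standard gzip format, so the gzip program on the
Linux side should be able to read the result.

```python
import gzip
import shutil

CHUNK = 1024 * 1024  # copy 1 MiB at a time so a 54GB file never sits in memory


def gzip_file(src, dst=None):
    """Compress src into dst (default: src + '.gz'); src is left in place."""
    dst = dst or src + ".gz"
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, CHUNK)
    return dst


def gunzip_file(src, dst):
    """Decompress the gzip file src into dst; src is left in place."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, CHUNK)
    return dst
```

Calling gzip_file("LifetrackTest01.dmp") would write
LifetrackTest01.dmp.gz next to the original.  Unlike the gzip program,
this sketch does not delete the input file.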

> gunzip seems to have a problem with it and complains about data length
> error.

What version of gzip are you using?  I fear it might be an older 1.2
version.  There were many improvements to handle large files in the
1.3 versions.  (However I see just now that you do say that you can
gzip and gunzip the file successfully so I guess this is answered
indirectly by that statement but I will continue to ask it anyway.)

  gzip --version | head -n1

The latest stable version is 1.3.12 available from the gnu ftp
depots.  Here is the information about it.

  http://lists.gnu.org/archive/html/bug-gzip/2007-04/msg00002.html

Unfortunately this will be hard to debug without the files that are
causing the problem, and they are far too big to use as a test case.

> I now use the pscp (This is a putty utility running on windows that takes
> care of a secure copy from windows to Unix by using internally the ssh
> functionality.)

pscp.exe is a good program.

> There is a parameter -C . This takes care of compressing the data on the
> sending end and decompress at the receiving end thereby reducing the
> bandwidth.

I am confident that the file is copied correctly, and I know that it
will take a long time to run an integrity check across that much data.
But if you could verify that the files are the same on both sides that
would be a good thing.  Let it run overnight.  :-)

  md5sum LifetrackTest01.dmp

If the checksums match then we can be pretty sure that the files are
the same both before and after the copy.
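
Windows does not ship md5sum, so for the Windows side here is a
minimal sketch that computes the same kind of checksum in Python.  It
assumes Python is available on that machine, which is my assumption
and not something you have said, and file_md5 is only an illustrative
name.  It reads the file in chunks so that the 54GB file never has to
fit in memory.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading chunk_size bytes at a time."""
    digest = hashlib.md5()
    # Binary mode matters: text mode could translate line endings on Windows.
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The string returned by file_md5("LifetrackTest01.dmp") is the same hex
digest that md5sum prints in its first column, so the two sides can be
compared directly.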

> Anyway when I have the 54 GB file on my Linux box I tried gzip and gunzip
> and encountered no problems at all.

Good to hear.

> So my conclusion is that winzip does something with the formatting of the
> output file that gzip doesn't like.

That is possible.  That is why I suggest using gzip on both sides of
the process.
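
One way to look at what winzip actually wrote, sketched here in Python
on the assumption that Python is available (describe_zip and
single_deflated_member are names I invented for this example).  The
gzip documentation says gunzip only handles a zip file with a single
member compressed with the deflation method, so the sketch reports
exactly that.  It also flags a member whose size does not fit in the
32-bit size field of the classic zip format, because such a member
needs the Zip64 extension.  I have not checked whether gzip's zip
support understands Zip64, so treat that flag as a hint and not a
diagnosis.

```python
import zipfile


def describe_zip(path):
    """Return one dict per archive member with its name, sizes and method."""
    rows = []
    with zipfile.ZipFile(path) as zf:
        for info in zf.infolist():
            rows.append({
                "name": info.filename,
                "size": info.file_size,
                "compressed": info.compress_size,
                "deflated": info.compress_type == zipfile.ZIP_DEFLATED,
                # Classic zip stores sizes in 32 bits; larger needs Zip64.
                "needs_zip64": info.file_size >= 0xFFFFFFFF,
            })
    return rows


def single_deflated_member(path):
    """True if the archive has exactly one member and it is deflated."""
    rows = describe_zip(path)
    return len(rows) == 1 and rows[0]["deflated"]
```

This only reads the archive's directory, not the compressed data, so
it should be quick even on the 13GB file.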

I don't know if anyone on this mailing list is familiar with the
internals of winzip.  Making gzip handle an unusual .zip file is
something of a special case.  I am not sure there is a great return on
investment in trying to make that case work because the source end is
out of our control.  Using gzip to compress the file would put the same
program on both ends.  You seem to have other GNU tools available.  Do
you have gzip available there too?  If not there are versions for
Windows that can be downloaded.

> Hope this explains it a bit.

It was useful.  Thanks.

Bob


address@hidden wrote:
> 
> Hi Bob,
> 
> The file is created on Windows with Winzip
> Original File size is around 54GB, zipped size around 13 GB
> 
> Only 1 file.
> 
> This .zip file is copied using winscp (binary mode) from Windows server to
> Linux Server
> 
> The unzip doesn't want to know about it (Very likely file to big).
> 
> Small files OK see my example:
> address@hidden Database]# ls -liatr
> total 51289396
> 8863761 drwxrwxrwx    4 sybase   root         4096 Sep  5 11:13 ..
> 9256980 -rw-r--r--    1 sybase   root      4696064 Sep 10 19:31 master.dmp
> 9256981 -rw-r--r--    1 sybase   root       374784 Sep 10 19:31 model.dmp
> 9256982 -rw-r--r--    1 sybase   root     59899904 Sep 10 19:32 sybsystemprocs.dmp
> 9256983 -rw-r--r--    1 sybase   root     52403593216 Sep 10 20:38 LifetrackTest01.dmp
> 9256984 -rw-r--r--    1 sybase   root       446464 Sep 10 20:38 sybsystemdb.dmp
> 9256979 drwxrwxrwx    2 sybase   root         4096 Sep 11 13:57 .
> address@hidden Database]# zip model  model.dmp
>   adding: model.dmp (deflated 84%)
> address@hidden Database]# zip Lifetrack LifetrackTest01.dmp
>         zip warning: name not matched: LifetrackTest01.dmp
> 
> zip error: Nothing to do! (Lifetrack.zip)
> 
> address@hidden Database]# ls -liatr
> total 51289460
> 8863761 drwxrwxrwx    4 sybase   root         4096 Sep  5 11:13 ..
> 9256980 -rw-r--r--    1 sybase   root      4696064 Sep 10 19:31 master.dmp
> 9256981 -rw-r--r--    1 sybase   root       374784 Sep 10 19:31 model.dmp
> 9256982 -rw-r--r--    1 sybase   root     59899904 Sep 10 19:32 sybsystemprocs.dmp
> 9256983 -rw-r--r--    1 sybase   root     52403593216 Sep 10 20:38 LifetrackTest01.dmp
> 9256984 -rw-r--r--    1 sybase   root       446464 Sep 10 20:38 sybsystemdb.dmp
> 9256985 -rw-r--r--    1 root     root        60031 Sep 11 13:57 model.zip
> 9256979 drwxrwxrwx    2 sybase   root         4096 Sep 11 13:57 .
> 
> 
> 
> gunzip seems to have a problem with it and complains about data length
> error.
> 
> I now use the pscp (This is a putty utility running on windows that takes
> care of a secure copy from windows to Unix by using internally the ssh
> functionality.)
> 
> There is a parameter -C . This takes care of compressing the data on the
> sending end and decompress at the receiving end thereby reducing the
> bandwidth.
> 
> Anyway when I have the 54 GB file on my Linux box I tried gzip and gunzip
> and encountered no problems at all.
> 
> So my conclusion is that winzip does something with the formatting of the
> output file that gzip doesn't like.
> 
> 
> 
> Hope this explains it a bit.
> 
> Thanks for your reply.
> 
> Regards
> 
> Jan Krimp
> DBA
> 
> Tech Solutions
> AMP FINANCIAL SERVICES
> L17, HP Tower
> 171 Featherston Street
> WELLINGTON, NZ
> DDI:      04   498 8653
> FAX:     04   498 8025
> MOB:    021 2898998
> 
> 
> From: address@hidden (Bob Proulx)
> Date: 11/09/2007 01:32 p.m.
> To: address@hidden
> cc: address@hidden
> Subject: Re: Error (bug) in gunzip
> 
> 
> 
> 
> address@hidden wrote:
> > I want to use gunzip on linux to "uncompress" a winzip file
> 
> If you have 'unzip' available that would almost certainly always be
> the preferred tool to unzip a zip file.
> 
> > In the meantime we us Putty's pscp.
> 
> Uhm...  I do not understand the connection between pscp, an SSH file
> copy client, and either gunzip or zip files.  I don't see any
> connection.
> 
> > address@hidden jk]# gunzip -t -S .zip Lifetrack01
> >
> > gunzip: Lifetrack01.zip: invalid compressed data--length error
> 
> The documentation says:
> 
>        Files created by zip can be uncompressed by gzip only if they
>        have a single member compressed with the deflation method. This
>        feature is only intended to help conversion of tar.zip files to
>        the tar.gz format.  To extract a zip file with a single member,
>        use a command like gunzip <foo.zip or gunzip -S .zip foo.zip.
>        To extract zip files with several members, use unzip instead of
>        gunzip.
> 
> What is in your zip file?  If it contains multiple files or does not
> match the expected zip format then this won't work.
> 
>   unzip -v Lifetrack01.zip
> 
> Better to use unzip on zip file and simply avoid the issue entirely.
> 
> Bob
> 
> 
> 
> 
> 



