From: Tim Ruehsen
Subject: Re: [Bug-wget] Invalid Content-Length header in WARC files, on some platforms
Date: Tue, 13 Nov 2012 09:48:37 +0100
User-agent: KMail/1.13.7 (Linux/3.2.0-4-amd64; KDE/4.8.4; x86_64; ; )

Hello Gijs,
just out of curiosity:
what about setting the compiler option -D _FILE_OFFSET_BITS=64 on these
systems?
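To see what off_t actually looks like on one of these systems, a small check along the following lines could be compiled twice, once with and once without -D _FILE_OFFSET_BITS=64. This is only a sketch: the helper names (off_t_bits, off_t_is_large, report_off_t) are made up for illustration and are not part of wget, and whether the macro has any effect depends on the C library (it is a glibc-style feature-test macro).

```c
/* _FILE_OFFSET_BITS must be defined before any system header is included;
   it is normally passed on the command line as -D _FILE_OFFSET_BITS=64. */
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: width of off_t in bits for this build. */
static int off_t_bits(void)
{
    return (int) (sizeof(off_t) * CHAR_BIT);
}

/* Hypothetical helper: nonzero if off_t can represent offsets beyond 2 GiB. */
static int off_t_is_large(void)
{
    return off_t_bits() >= 64;
}

/* Hypothetical helper: print the result, e.g. from main(). */
static void report_off_t(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "off_t is %d bits (%s)\n", off_t_bits(),
            off_t_is_large() ? "large files OK" : "no large file support");
}
```

Calling report_off_t(stdout) from main() in both builds would show whether the option changes anything on the affected PowerPC and ARM systems.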
Since off_t is used in many places for file lengths, there should be many more
problems with large files. I just wonder how large files are generally handled
on these PowerPC and ARM systems. If there is no such general way, using
off_t wouldn't make sense (unless these systems can't handle large files at
all, but then your patch wouldn't make sense either).
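Whatever width off_t ends up with, the length still has to be turned into header text without going through a printf conversion that assumes a different width. The thread does not say that this is what goes wrong in warc.c, so the following is only a sketch of one width-independent way to print an off_t, by casting to intmax_t and using %jd (C99). The function name format_content_length is hypothetical and is not wget's code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: write "Content-Length: N\r\n" into buf.
   Returns the number of characters written (excluding the terminating NUL),
   or -1 if the output did not fit or an encoding error occurred. */
static int format_content_length(char *buf, size_t size, off_t length)
{
    assert(buf != NULL);

    /* intmax_t is at least as wide as any signed integer type, so the
       cast is lossless for both 32-bit and 64-bit off_t. */
    int n = snprintf(buf, size, "Content-Length: %jd\r\n", (intmax_t) length);

    return (n < 0 || (size_t) n >= size) ? -1 : n;
}
```

The same function then works unchanged whether or not the build uses -D _FILE_OFFSET_BITS=64.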
Maybe you could shed some light on this.
Regards, Tim
On Monday 12 November 2012, Gijs van Tulder wrote:
> Hi,
>
> There's a somewhat serious issue in the WARC-generating code: on some
> platforms (presumably the ones where off_t is not a 64-bit number) the
> Content-Length header at the top of each WARC record has an incorrect
> length. On these platforms it is sometimes 0, sometimes 1, but never the
> correct length. This makes the whole WARC file unreadable.
>
> The code works fine on many platforms, but it is apparently a problem on
> some PowerPC and ARM systems, and maybe other systems as well.
>
> Existing WARC files with this problem can be repaired by replacing the
> value of the Content-Length header with the correct value, for each WARC
> record in the file. The content of the WARC records is there, it's just
> the Content-Length header that is wrong.
>
> The attached patch fixes the problem in warc.c. It replaces off_t by
> wgint and uses the number_to_static_string function from util.c.
>
> Regards,
>
> Gijs