bug#67926: 29.1; fail to extract ZIP subfile named with [...]
From: awrhygty
Subject: bug#67926: 29.1; fail to extract ZIP subfile named with [...]
Date: Thu, 04 Jan 2024 04:53:26 +0900
User-agent: Gnus/5.13 (Gnus v5.13)
Eli Zaretskii <eliz@gnu.org> writes:
>> My interest is how to avoid naming problems.
>> There are more difficulties in Japanese.
>> Japanese characters in file names are normally encoded in cp932.
>> Encoded characters may have '[', '\' or ']' as a second byte.
>> (encode-coding-string "ゼソゾ" 'cp932)
>> => "\203[\203\\\203]"
>> Subfiles of such names can not be extracted normally.
>
> I don't think we can solve this in Emacs: non-ASCII file names in zip
> archives are a mess, even before you consider the fact that zip
> archives are frequently moved between systems. For starters, how can
> one know in advance what is the encoding of file names in an arbitrary
> zip archive? This will bite you even if we do everything in Emacs,
> and even if someone does submit patches to implement all the
> compression methods.
So I need an extractor that does not depend on subfile names.
It is more useful to extract contents under broken names than to be
unable to extract them at all.
I also found that my unzip.exe cannot extract BZIP2- or LZMA-compressed
subfiles created by the Python zipfile module. I doubt that unzip.exe
works for all compression methods.
By the way, I did not know about the zlib-decompress-region function.
With it, subfiles compressed with the deflate method can be extracted
using only an elisp program.
(advice-add #'archive-zip-extract :override
            #'archive-zip-decompress-content)
(defun archive-zip-decompress-content (archive name)
  (let* ((desc archive-subfile-mode)
         (buf (current-buffer))
         (bufname (buffer-file-name)))
    (set-buffer archive-superior-buffer)
    (save-restriction
      (widen)
      ;; p0 is the central directory entry, p the local file header
      ;; (its offset is stored at p0+42).
      (let* ((file-beg archive-proper-file-start)
             (p0 (+ file-beg (archive--file-desc-pos desc)))
             (p (+ file-beg (archive-l-e (+ p0 42) 4)))
             (bitflags (archive-l-e (+ p 6) 2))
             (method (archive-l-e (+ p 8) 2))
             (compsize (archive-l-e (+ p0 20) 4))
             (fn-len (archive-l-e (+ p 26) 2))
             (ex-len (archive-l-e (+ p 28) 2))
             (data-beg (+ p 30 fn-len ex-len))
             (data-end (+ data-beg compsize))
             (coding-system-for-read 'no-conversion)
             (coding-system-for-write 'no-conversion)
             (default-directory temporary-file-directory))
        ;; Bit 0 of the general-purpose flags marks an encrypted entry.
        (cond ((/= 0 (logand bitflags 1))
               (message "Subfile is encrypted"))
              ;; Method 0: stored, copy the bytes as they are.
              ((= method 0)
               (with-current-buffer buf
                 (insert-buffer-substring archive-superior-buffer
                                          data-beg data-end)))
              ;; Method 8: deflate.  Wrap the raw data in a gzip header
              ;; and trailer (CRC-32 and original size taken from the
              ;; central directory) for zlib-decompress-region.
              ((eq method 8)
               (let ((crc-32 (buffer-substring (+ p0 16) (+ p0 20)))
                     (orig-size (buffer-substring (+ p0 24) (+ p0 28)))
                     (header "\x1f\x8b\x08\0\0\0\0\0\0\0"))
                 (with-current-buffer buf
                   (set-buffer-multibyte nil)
                   (insert header)
                   (insert-buffer-substring archive-superior-buffer
                                            data-beg data-end)
                   (insert crc-32 orig-size)
                   (zlib-decompress-region (point-min) (point-max))
                   (set-buffer-multibyte 'to))))
              ;; Method 12: bzip2, decompressed by the external program.
              ((eq method 12)
               (call-process-region data-beg data-end
                                    "bzip2" nil buf nil "-cd"))
              (t (message "Unknown compression method")))))
    (set-buffer buf)))
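
The deflate branch above relies on one trick: a ZIP entry holds a raw
deflate stream, and putting a 10-byte gzip header in front of it and the
CRC-32 and original size behind it gives a gzip member that
zlib-decompress-region can read. The following is only a minimal sketch
of that wrapping outside of arc-mode. The function name
my-gzip-wrap-demo and the hand-built "stored" deflate block for the
string "hello" are assumptions made for illustration; they are not part
of the code above.

```elisp
;; Hypothetical demo: wrap a raw deflate stream in a gzip member so
;; that `zlib-decompress-region' accepts it.
(defun my-gzip-wrap-demo ()
  "Return the text decompressed from a hand-built gzip member.
Return nil if Emacs lacks zlib support or decompression fails."
  (when (and (fboundp 'zlib-available-p) (zlib-available-p))
    (with-temp-buffer
      ;; `zlib-decompress-region' only works in unibyte buffers.
      (set-buffer-multibyte nil)
      (insert
       ;; gzip header: magic 1f 8b, CM=8 (deflate), no flags,
       ;; zero mtime, XFL=0, OS=0.
       (unibyte-string #x1f #x8b #x08 0 0 0 0 0 0 0)
       ;; Raw deflate data: one final "stored" block, LEN=5, NLEN=~5.
       (unibyte-string #x01 #x05 #x00 #xfa #xff)
       "hello"
       ;; gzip trailer, both little-endian: CRC-32 of "hello"
       ;; (#x3610a686), then the uncompressed size (5).
       (unibyte-string #x86 #xa6 #x10 #x36)
       (unibyte-string #x05 #x00 #x00 #x00))
      (and (zlib-decompress-region (point-min) (point-max))
           (buffer-string)))))
```

In the ZIP case the CRC-32 and the uncompressed size are already stored
little-endian in the central directory, which is why the code above can
copy those bytes unchanged into the trailer.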