bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 enc
From: Eli Zaretskii
Subject: bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
Date: Tue, 16 Dec 2014 18:20:25 +0200
> Date: Tue, 16 Dec 2014 18:05:38 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: 19393@debbugs.gnu.org
>
> > From: Tassilo Horn <tsdh@gnu.org>
> > Date: Tue, 16 Dec 2014 16:21:10 +0100
> >
> > ftp://ftp.fu-berlin.de/pub/misc/movies/database/movies.list.gz
> >
> > which contains all movies known to the international movie database
> > (IMDb.com). When I open that file using "emacs -Q movies.list.gz" (or
> > unzip it first) and then do M-x describe-coding-system I can see that it
> > is "t -- raw-text-unix". As a result of this, the last movie in that
> > file is displayed as "\374\347 (2012) 2012".
> >
> > However, according to the `file' command, the file is plain ISO-8859.
>
> Looks like some kind of bug, although with such a large file, it's not
> easy to be sure.
Actually, I don't think this is a bug. There are ISO-8859-15
characters in that file that are not part of ISO-8859-1, so Emacs will
not detect that encoding unless either (a) your locale dictates that
encoding, or (b) you change the preferences to prefer ISO-8859-15.
This is so with any 8-bit encoding -- Emacs cannot easily distinguish
between them, and needs some guidance.
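
To illustrate why 8-bit encodings are hard to tell apart, here is a
small Python sketch (not anything Emacs itself runs). It decodes the
two bytes from the report, "\374\347", plus one extra byte, 0xA4, under
both ISO-8859-1 and ISO-8859-15. The 0xA4 byte is my own addition for
illustration and is not taken from the file; the variable and function
names are likewise made up. Both decodings succeed without error, so
the bytes alone give no hint about which encoding was intended.

```python
# Bytes "\374\347" (octal) from the report, plus 0xA4, which is an
# assumed example byte: it is one of the few positions where
# ISO-8859-1 and ISO-8859-15 assign different characters.
sample = bytes([0o374, 0o347, 0xA4])


def decode_all(data, encodings=("iso8859-1", "iso8859-15")):
    """Decode the same bytes under each of the given 8-bit encodings."""
    return {name: data.decode(name) for name in encodings}


results = decode_all(sample)
for name, text in results.items():
    # Neither decode raises an error; only the last character differs.
    print(name, repr(text))
```

The first two bytes decode to the same text either way; only the third
differs (currency sign versus euro sign), which is why some outside
guidance such as the locale or a stated preference is needed.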