monotone-devel
Re: [Monotone-devel] Non-ascii filename under win32


From: Nathaniel Smith
Subject: Re: [Monotone-devel] Non-ascii filename under win32
Date: Wed, 20 Sep 2006 05:21:27 -0700
User-agent: Mutt/1.5.13 (2006-08-11)

On Mon, Sep 18, 2006 at 09:49:47PM +0200, Jesper Ribbe wrote:
> Hello,
> 
> I'm a new user to Monotone and am currently evaluating it under win32.
> I've tried to add a filename which contains the Swedish character "Å", 
> but fail to get this to work.
> 
> The error message I get is:
> mtn: fatal: std::logic_error: paths.cc:255: invariant 
> 'I(utf8_validate(path))' violated

Right -- the theory is that monotone uses your local filesystem
charset when talking to your local filesystem, uses utf8 internally
(so as to have a canonical format that everyone can use, even if they
have different local charsets), and converts as necessary.

I assume your local character set is something non-unicode, like an
ISO-8859 variant or similar?  (The dump file monotone writes when it
crashes like that includes some information on what locale settings
monotone thinks you are using, so it would help to include it.)

The problem is probably that monotone isn't converting something when
it should.  There are some known bugs in this stuff, and no-one's
gotten around to doing a systematic audit/fixup yet.  If you're
curious, some are marked "BUG" in the source; and the main code
involved is paths.hh/.cc.  We're pretty good about this stuff when it
comes time to look for known files, or write files out; the bug you're
most likely running into is that when we go out and ask the filesystem
what files exist (the way "mtn add <directory>" does, for instance),
we don't convert from filesystem charset->utf8.
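
To illustrate that failure mode, here is a minimal sketch in Python.
It is not monotone's actual C++ code: the helper names
is_valid_utf8 and to_internal_utf8 are made up for illustration, and
the choice of ISO-8859-1 and CP850 as the local charsets is an
assumption.  It shows that the raw bytes a non-unicode filesystem
returns for "Å" are not valid utf8 until they are converted.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_internal_utf8(raw: bytes, fs_charset: str) -> bytes:
    """Convert a name from the (assumed) filesystem charset to UTF-8."""
    return raw.decode(fs_charset).encode("utf-8")


# The same filename as two different local charsets would store it.
name_latin1 = "Å.txt".encode("iso-8859-1")   # b"\xc5.txt"
name_cp850 = "Å.txt".encode("cp850")         # b"\x8f.txt"

# Unconverted bytes fail a UTF-8 validity check ...
print(is_valid_utf8(name_latin1))
print(is_valid_utf8(name_cp850))

# ... but pass once converted from the correct local charset.
print(is_valid_utf8(to_internal_utf8(name_latin1, "iso-8859-1")))
print(is_valid_utf8(to_internal_utf8(name_cp850, "cp850")))
```

A name that skips the conversion step reaches the utf8 check as raw
local bytes, which would be consistent with the invariant failure
quoted above.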

If that's the issue, a possible workaround is to pass the names of the
offending files you want to add explicitly on the command line,
instead of letting monotone find them by searching the filesystem --
command line arguments are converted correctly.

It's also possible you're running into some sort of other bug.  Win32
is particularly arcane about filesystems and character sets, so we
might be doing things wrong somehow.

> I've tried to set the environment variable CHARSET to "CP850", "CP437", 
> "UTF-8", "UTF-16", "ISO-8859-1" etc. With some (ie UTF-16) I cannot add 
> any filename.

CHARSET, if you set it at all, should be set to the character set you
use for your own typing, and that your filesystem uses.  If you set it
to UTF-16 and said "mtn add mydir", monotone probably tried to convert
"mydir" out of UTF-16 and failed miserably...
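
As a rough illustration of why a wrong CHARSET breaks even plain
ASCII names, here is a small Python sketch.  It only shows what
happens when single-byte text is decoded as UTF-16; the helper name
decode_argument is hypothetical, and how monotone itself reports the
failure is not something this sketch claims to reproduce.

```python
def decode_argument(raw: bytes, charset: str):
    """Decode a command-line argument from the declared charset.

    Returns the decoded text, or None if the bytes are not valid
    in that charset.
    """
    try:
        return raw.decode(charset)
    except UnicodeDecodeError:
        return None


# "mydir" typed in a single-byte charset is five bytes: an odd length,
# so it cannot be a whole number of 16-bit UTF-16 code units.
odd = decode_argument(b"mydir", "utf-16-le")
print(odd)

# An even-length name decodes without error, but pairs of bytes are
# fused into unrelated characters, so the result is not the name typed.
even = decode_argument(b"docs", "utf-16-le")
print(even == "docs")

# Declaring the charset actually used for typing works as expected.
print(decode_argument(b"mydir", "iso-8859-1"))
```

Either way, the name monotone ends up with is not the one you typed,
which is why CHARSET has to match what your terminal and filesystem
really use.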

> I've also tried to make a monotonerc:
> function get_charset_conv(filename)
>   return "UTF-8","CP850"
> end
> but still no success.

Don't use that hook -- it's for munging the character set of the data
inside files, and is probably a Bad Idea in any case.

-- Nathaniel

-- 
In mathematics, it's not enough to read the words
you have to hear the music

This email may be read aloud.



