monotone-devel
[Monotone-devel] Bug in CRLF conversions


From: Yury Polyanskiy
Subject: [Monotone-devel] Bug in CRLF conversions
Date: Sat, 28 Jan 2006 12:19:28 -0500

Hi all!

I think there is a bug in the CRLF<->LF conversions, or at least
unintended behavior.

Consider the following hook:

function get_linesep_conv(fname)
    return {"LF", "CRLF"}
end

With this hook I expect all files to be translated from LF to CRLF on
checkout, and indeed that happens. However, suppose I have a binary
file which (with such a dumb hook) is also fed through the line
translation procedure. Then each occurrence of 0x0a *OR* 0x0d is
replaced by the two bytes [0x0d 0x0a].

Now, if I try to commit, the translation procedure is applied backwards.
But this time every occurrence of [0x0d 0x0a] is replaced by 0x0a.

The net effect is that every lone 0x0d has been translated into 0x0a.
This will be committed to the database and eventually ruin everyone
else's binaries as well.
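
To make the round trip concrete, here is a minimal sketch that models the
behavior described above. It is not monotone's actual code: the function
names to_crlf_described() and to_lf_described() are hypothetical, and the
forward step simply follows the description (each 0x0a or 0x0d byte
becomes 0x0d 0x0a).

```cpp
#include <cstddef>
#include <string>

// Hypothetical model of the checkout direction as described in this
// message: every 0x0d or 0x0a byte is replaced by the pair 0x0d 0x0a.
std::string to_crlf_described(const std::string& in) {
    std::string out;
    for (char c : in) {
        if (c == '\r' || c == '\n') {
            out += "\r\n";
        } else {
            out += c;
        }
    }
    return out;
}

// Hypothetical model of the commit direction: every 0x0d 0x0a pair is
// replaced by a single 0x0a; all other bytes are copied unchanged.
std::string to_lf_described(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\r' && i + 1 < in.size() && in[i + 1] == '\n') {
            out += '\n';
            ++i;  // skip the 0x0a that belongs to this pair
        } else {
            out += in[i];
        }
    }
    return out;
}
```

Feeding the three bytes 0x01 0x0d 0x02 through to_crlf_described() and
then to_lf_described() yields 0x01 0x0a 0x02, so the 0x0d is lost.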

I understand that I should not apply CRLF conversion to binary files.
If I do so by mistake, I can expect my checked-out binary to be bad.
But I DO NOT expect this to change the database version of the binary,
right?

The problem here is that the translation procedure is not reversible: it
replaces 0x0d's *OR* 0x0a's by \r\n, rather than only the source
separator.

Speaking programmatically, the solution is simple: add one more argument
to line_end_convert(), i.e. make it look like

void line_end_convert(linesep_from, linesep_to, string, out)

and translate ONLY linesep_from to linesep_to. In my case that will lead
to translating only 0x0a's to \r\n; 0x0d's are left untouched.
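
A rough sketch of what I mean follows. The parameter types are my
assumption, and for simplicity the separators are passed as literal byte
sequences ("\n", "\r\n") rather than as the names "LF"/"CRLF" that the
hook returns; the real function would need to map those names first.

```cpp
#include <cstddef>
#include <string>

// Sketch of the proposed interface (types are assumed, not taken from
// the monotone sources): replace only exact occurrences of linesep_from
// with linesep_to and copy every other byte through unchanged.
void line_end_convert(const std::string& linesep_from,
                      const std::string& linesep_to,
                      const std::string& in,
                      std::string& out) {
    std::string result;
    if (linesep_from.empty()) {
        result = in;  // nothing to search for, so copy verbatim
    } else {
        std::size_t pos = 0;
        for (;;) {
            std::size_t hit = in.find(linesep_from, pos);
            if (hit == std::string::npos) {
                // No more separators: copy the remaining tail.
                result.append(in, pos, std::string::npos);
                break;
            }
            result.append(in, pos, hit - pos);  // bytes before the match
            result += linesep_to;               // the replacement
            pos = hit + linesep_from.size();
        }
    }
    // Assign at the end so that `out` may refer to the same object as `in`.
    out = result;
}
```

With this version, converting "\n" to "\r\n" and then "\r\n" back to
"\n" returns the original bytes, because every 0x0a in the intermediate
data is preceded by the 0x0d that was inserted for it and a lone 0x0d is
never touched. It is also a single pass over the input.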

Moreover, the current implementation of line_end_convert(), which works
through split_lines() and join_lines(), is less efficient than what is
proposed above.

So what do you think?

WBR,
Yury.




