Re: [Monotone-devel] Re: Bug in CRLF conversions


From: Ethan Blanton
Subject: Re: [Monotone-devel] Re: Bug in CRLF conversions
Date: Thu, 2 Feb 2006 09:41:03 -0500
User-agent: Mutt/1.4.1i

Richard Levitte - VMS Whacker spake unto us the following wisdom:
> In message <address@hidden> on Wed, 1 Feb 2006 21:19:37 -0500, Ethan Blanton 
> <address@hidden> said:
> eblanton> Given this set of rules, I claim that *text conversions* are
> eblanton> as optimal as can be expected.  Monotone seems to differ
> eblanton> from this case in that CR is also considered for conversion,
> eblanton> _even if bare CR is not a platform line ending_.  *This* is
> eblanton> what I say is broken, and needs to be fixed.
> 
> monotone differs in that it currently always considers CR, LF and CRLF
> to be line endings.  They will all be converted to LF on checkin.

This is what I consider "wrong".
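
To make the objection concrete, here is a minimal Python sketch of the
behavior described above.  It is an illustration only, not monotone's
actual code: the name normalize_all and the sample bytes are invented
for this example, and it assumes an LF platform where checkout applies
no conversion at all.

```python
import re


def normalize_all(data: bytes) -> bytes:
    """Rewrite CRLF, bare CR and bare LF alike as LF (hypothetical helper)."""
    # CRLF is listed first so the pair is consumed as one line ending.
    return re.sub(rb"\r\n|\r|\n", b"\n", data)


# A file on an LF platform that uses a bare CR as ordinary content.
unix_text = b"progress: 10%\rprogress: 20%\n"

stored = normalize_all(unix_text)   # what would land in the database
restored = stored                   # LF platform: checkout changes nothing

print(stored)
print("round trip intact:", restored == unix_text)
```

The bare CR comes back as LF, so the checked-out file no longer matches
what was checked in, even though CR was never a line ending on that
platform.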

> eblanton> There is not.  But there is currently a TEXT FILE corruption
> eblanton> problem.  Binary files are just a complication thrown into
> eblanton> the mix.  I agree that the binary file problem also has to
> eblanton> be solved.
> 
> Ah, this one hit the nail.  We still disagree on what a text file
> really is.  To me, "text files" with embedded control characters (ANY
> embedded control character, like \r on a Unix system unless it's part
> of \r\n) are not really text files; they should be considered binary.
> Whatever conversion you do on those, you can be sure the result will
> be trash on a platform with different line endings.

My pushback is that \r and \n are both perfectly valid ASCII
whitespace; I expect to see exceptions for, e.g., \r, \n, \v, \t, and
^h -- because all of these are perfectly valid in ASCII *text* files.

> Maybe we should talk about "transformable" and "non-transformable"
> files, so we don't get confused by what we consider to be text and
> binary?  I think I'll do so from here on.

Agreed.  That separates the two debates nicely until a dividing line
is decided upon.  ;-)

> Also, because of this, it seems like you and I put different focus on
> "transformable file corruption" and "non-transformable file
> corruption".  I'm much more worried about the latter (and please
> understand that from my point of view, those "text files" with
> embedded control characters are really part of the latter).

Honestly, I don't give a flying rat's butt about line ending
transformations in any *practical* sense -- I never use any platform
that isn't bare LF, and I seldom associate (in terms of repository
sharing, although it stands in general) with people who use other line
endings.  It's all on principle, here.  However, I also use revision
control to manage a LOT of non-source files, both transformable and
otherwise.

> Also, from the point of view of what needs to be changed in monotone,
> I see the "don't touch this file" and the "conversion of line ending"
> problems as quite separate.  The only real connection between the two
> is that a "don't touch this file" marking would prevent "conversion of
> line ending" from occurring.  Maybe we should discuss handling of
> non-transformable files in another thread?

Agreed on all points.

[snip]

> eblanton> I don't understand what is not reversible about the above.
> eblanton> There are situations where it breaks down (e.g., CR local
> eblanton> line ending, and actual CRs occur in the text document), but
> eblanton> it is a far sight better than what I understand is the
> eblanton> current conversion.
> 
> Oh, so you and I do understand "reversible" differently.  To me,
> "reversible" means there is a 1 to 1 mapping between the original file
> (the one being checked in) and the resulting file (the one being checked
> out) in all possible cases.  If there's any case where a conversion back
> and forth doesn't get you back to the original, I can't see that as
> reversible.  Then again, I'm half Swedish, half French, and English is
> just my third language, so it's quite possible we're hitting a natural
> language issue.  You tell me.

No, I'm abusing "reversible", it's my fault.  I really mean something
like "reversible under reasonable circumstances where monotone is
currently not doing something equally reversible".  It is certainly
not a 100% reversible transformation, and I probably should not have
used this term.
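
A rough Python sketch of the platform-only rule under discussion, with
a round-trip check in the strict 1-to-1 sense.  The names checkin,
checkout and round_trips are hypothetical, and the sketch assumes LF is
the stored line ending; it is not how monotone is implemented.

```python
def checkin(data: bytes, eol: bytes) -> bytes:
    """Convert only the platform's own line ending to LF."""
    return data if eol == b"\n" else data.replace(eol, b"\n")


def checkout(data: bytes, eol: bytes) -> bytes:
    """Convert stored LF back to the platform's own line ending."""
    return data if eol == b"\n" else data.replace(b"\n", eol)


def round_trips(data: bytes, eol: bytes) -> bool:
    """True if checkin followed by checkout reproduces the original."""
    return checkout(checkin(data, eol), eol) == data


# LF platform: nothing is touched, so a bare CR survives.
print(round_trips(b"a\rb\n", b"\n"))
# CRLF platform, ordinary text: survives.
print(round_trips(b"a\r\nb\r\n", b"\r\n"))
# CRLF platform with a stray bare LF: does not survive.
print(round_trips(b"a\nb\r\n", b"\r\n"))
# CR platform with an embedded LF: does not survive.
print(round_trips(b"a\rb\nc\r", b"\r"))
```

The first two cases round-trip and the last two do not, which is the
sense in which the rule holds "under reasonable circumstances" without
being a true 1-to-1 mapping.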

> And so we don't get side-tracked in another argument; no, I'm not saying
> that monotone's current behavior is reversible.  It isn't.  But for
> what I consider a text file, I don't see it as an issue.
> 
> Btw, something in Yury's example made me think a bit, and it occurred
> to me that if the database line ending would be CRLF and we only
> convert from and to the platform specific line ending, we would have
> something reversible the way I understand "reversible" (this is under
> the condition that no file ever magically appears in the database
> without having been committed to it).  And in this case, it would work
> to do this with ALL files back and forth, even non-transformables.
> Think about it.  However, if we do this, we will get migration hell.

A quick mental exercise suggests that CRLF works with both my mental
model of text files and yours -- it's more or less byte stuffing, as I
mentioned before.  And, I agree that from the migration point of view
it's probably a bad idea.  I remain inclined to tell monotone not to
jack with line endings which aren't platform-appropriate, and leave
the behavior more or less as it is now, calling it "solved enough".
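
For what it's worth, the CRLF-in-the-database idea can be sketched in a
few lines of Python.  This is only an illustration under stated
assumptions: to_db and from_db are invented names, the file is checked
out on the same kind of platform it was committed from, and nothing
reaches the database except through to_db.

```python
from itertools import product


def to_db(data: bytes, eol: bytes) -> bytes:
    """Stuff each platform line ending out to CRLF for storage."""
    return data.replace(eol, b"\r\n")


def from_db(data: bytes, eol: bytes) -> bytes:
    """Collapse each stored CRLF back to the platform line ending."""
    return data.replace(b"\r\n", eol)


def always_reversible(eol: bytes, max_len: int = 6) -> bool:
    """Brute-force check over every short mix of 'a', CR and LF."""
    for n in range(max_len + 1):
        for combo in product(b"a\r\n", repeat=n):
            original = bytes(combo)
            if from_db(to_db(original, eol), eol) != original:
                return False
    return True


for eol in (b"\n", b"\r", b"\r\n"):
    print(repr(eol), always_reversible(eol))
```

The brute-force check covers only short inputs, but it matches the
byte-stuffing intuition: every platform line ending gains exactly one
partner byte on the way in and loses exactly that byte on the way out.
It says nothing about the migration cost for existing databases.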

Ethan

-- 
The laws that forbid the carrying of arms are laws [that have no remedy
for evils].  They disarm only those who are neither inclined nor
determined to commit crimes.
                -- Cesare Beccaria, "On Crimes and Punishments", 1764