gzz-dev

Re: [Gzz] ``canon3_file_format``: A canonical, N3-based file format


From: Benja Fallenstein
Subject: Re: [Gzz] ``canon3_file_format``: A canonical, N3-based file format
Date: Wed, 02 Apr 2003 16:11:47 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030327 Debian/1.3-4

Tuomas Lukka wrote:
The ``NEWLINE`` token may be any of CR, LF, or CRLF.
(This is necessary for CVS to be useful across platforms.)
In contexts where the specific form used matters,
the newline character is LF (in particular, when computing
a content hash, e.g., when creating a Canon3 Storm block).

This is just asking for trouble!

Can you be more specific?

If we want to use it with CVS, requiring LF would mean that the files would have to be added as binary, and AFAIK CVS wouldn't diff them then.
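
To illustrate the hashing rule quoted above, here is a minimal Python sketch. It assumes the hash is taken over the bytes after every CRLF and lone CR has been rewritten to LF. The function names and the choice of SHA-1 are assumptions made for illustration, not something taken from the Canon3 or Storm specs.

```python
import hashlib

def normalize_newlines(data: bytes) -> bytes:
    # Replace CRLF first so that its CR is not converted a second time.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

def content_hash(data: bytes) -> str:
    # Hypothetical helper: hash the LF-normalized form.
    # SHA-1 is an assumption; any fixed hash function works the same way.
    return hashlib.sha1(normalize_newlines(data)).hexdigest()

# The same statement as checked out with LF, CRLF, and CR line endings.
lf_form = b"<a> <b> <c>.\n"
crlf_form = b"<a> <b> <c>.\r\n"
cr_form = b"<a> <b> <c>.\r"

# All three working copies hash to the same value after normalization.
assert content_hash(lf_form) == content_hash(crlf_form) == content_hash(cr_form)
```

With this approach the files can stay in CVS as text, and only the code that computes the hash needs to care about the LF form.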

The triples must be ordered.

Capitalize "must" ;)

Which definition do you want to use?

Two triples are compared
by comparing their subjects, properties, and objects
in this order. Each of these parts is compared
as follows:

- Literals are lower than (go before) URIrefs,
 which go before anonymous nodes.

??????

What's the question?

- URIrefs are compared character-by-character,
 in the form defined in [RFC 2396]
 (i.e., *after* Unicode characters outside
 the ASCII range have been escaped).
 Characters are compared by Unicode code point
 value.

Is this the same as a lexicographic string comparison
of the UTF-8 encoded form?

I don't know.
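
As a rough illustration of the ordering rules quoted above, here is a minimal Python sketch. The `Node` type, the kind names, and the key functions are hypothetical. How literals and anonymous nodes are compared among themselves is not stated in the quoted text, so comparing them by string value here is only an assumption. Python compares `str` values by code point, which matches the rule given for URIrefs.

```python
from typing import NamedTuple, Tuple

# Rank of each node kind: literals < URIrefs < anonymous nodes.
KIND_RANK = {"literal": 0, "uriref": 1, "anon": 2}

class Node(NamedTuple):
    kind: str   # "literal", "uriref", or "anon" (hypothetical names)
    value: str  # for URIrefs: the escaped, ASCII-only RFC 2396 form

Triple = Tuple[Node, Node, Node]

def node_key(node: Node):
    # Kind first, then the string value compared by Unicode code point.
    return (KIND_RANK[node.kind], node.value)

def triple_key(triple: Triple):
    # Subject, then property, then object, in this order.
    return tuple(node_key(part) for part in triple)

def utf8_triple_key(triple: Triple):
    # Same ordering, but comparing the UTF-8 bytes of each value.
    return tuple((KIND_RANK[n.kind], n.value.encode("utf-8")) for n in triple)

a = Node("uriref", "http://example.org/a")
b = Node("uriref", "http://example.org/b")
p = Node("uriref", "http://example.org/p")
lit = Node("literal", "zzz")
anon = Node("anon", "_:x1")

triples = [(b, p, anon), (a, p, b), (a, p, lit), (a, p, anon)]
ordered = sorted(triples, key=triple_key)

# Same subject and property: literal object, then URIref, then anonymous node.
assert ordered == [(a, p, lit), (a, p, b), (a, p, anon), (b, p, anon)]

# For this sample data, the bytewise UTF-8 comparison gives the same order.
assert sorted(triples, key=utf8_triple_key) == ordered
```

On the UTF-8 question: once characters outside the ASCII range have been escaped, a URIref consists only of ASCII characters, each of which encodes to a single UTF-8 byte equal to its code point, so a bytewise comparison of the UTF-8 form should give the same order. The last assertion checks this for the sample data only.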

``URIref`` is a URI reference as defined in [RFC 2396].
Percent escapes (e.g. ``%2f``) should preferably
be encoded in lower case.

Should? Ouch... better not leave any choices here.

Addressed by simply using the Unicode form in serialization. I misinterpreted the N3 docs on this point. Unescaping would have been extremely hairy anyway.
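
A small Python sketch of why leaving the case of percent escapes open is a problem for a canonical format: two spellings that decode to the same character are still different strings, so they sort and hash differently. The example URIs are made up for illustration.

```python
from urllib.parse import unquote

lower = "http://example.org/a%2fb"
upper = "http://example.org/a%2Fb"

# Both escapes decode to the same character...
assert unquote(lower) == unquote(upper) == "http://example.org/a/b"

# ...but the serialized forms differ, so a content hash over them differs too.
assert lower != upper

# They also sort differently: 'f' (0x66) comes after 'F' (0x46).
assert lower > upper
```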

Your other comments are addressed in the PEG.

- Benja




