Re: [Gzz] ``canon3_file_format``: A canonical, N3-based file format

gzz-dev

From:	Tuomas Lukka
Subject:	Re: [Gzz] ``canon3_file_format``: A canonical, N3-based file format
Date:	Wed, 2 Apr 2003 21:16:18 +0300
User-agent:	Mutt/1.4.1i

On Wed, Apr 02, 2003 at 07:38:46PM +0200, Benja Fallenstein wrote:
> >>>>- URIrefs are compared character-by-character,
> >>>>in the form as defined in [RFC 2396]
> >>>>(i.e., *after* Unicode characters outside
> >>>>the ASCII range have been escaped).
> >>>>Characters are compared by Unicode code point
> >>>>value.
> >>>
> >>>Is this the same as a lexicographic string comparison
> >>>of the UTF-8 encoded one?
> >>
> >>I don't know.
> >
> >Need to explain how to compare. I couldn't write a program yet.
> 
> In Java: string1.compareTo(string2), on the in-memory representation as 
> used by Jena.

Are you positive that this really does the right thing?

This should be mentioned in the PEG as well.
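
On the comparison question above: as documented, Java's String.compareTo
compares UTF-16 code units. That matches code point order only while no
characters outside the Basic Multilingual Plane are involved. URIrefs that
have been escaped down to ASCII are unaffected, but literals could be.
Comparing the UTF-8 encodings byte by byte, as unsigned bytes, does give the
same order as comparing by code point. A minimal sketch, assuming a modern
Java (the code point methods and StandardCharsets arrived after this thread
was written) and well-formed strings; the class and method names are
hypothetical and not part of the PEG:

```java
import java.nio.charset.StandardCharsets;

final class CodePointOrder {
    private CodePointOrder() {}

    /** Compares two strings by Unicode code point value, not by UTF-16 code unit. */
    static int compareByCodePoint(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) return Integer.compare(ca, cb);
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // One string is a prefix of the other (or they are equal):
        // the one with nothing left sorts first.
        return Integer.compare(a.length() - i, b.length() - j);
    }

    /** Compares the UTF-8 encodings lexicographically, treating bytes as unsigned. */
    static int compareUtf8Bytes(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int k = 0; k < n; k++) {
            int d = (x[k] & 0xFF) - (y[k] & 0xFF);
            if (d != 0) return d;
        }
        return Integer.compare(x.length, y.length);
    }
}
```

For a BMP character above the surrogate range (say U+FF5E) against a
supplementary character (say U+10000), compareTo puts U+10000 first because
its leading surrogate is 0xD800, while both methods above put U+FF5E first.
If the PEG means code point order, it probably has to say so explicitly.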

> The full writer algorithm looks something like this:
> 
> - Get all Statements from the Model.
> - Put them into a SortedSet. Normalization into NFC is done at this 
> stage (note to self: find out how this works in Java.) The comparator 
> uses the algorithm specified in the PEG, using Java compareTo to compare 
> any strings.
> - Create a UTF-8 writer and write the header to it.
> - Write each statement in order. Escaping of literals is done at this stage.

Could be nice as a PEG appendix.

        Tuomas
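
The writer steps quoted above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the Jena-based writer
itself: Triple is a hypothetical stand-in for a Jena Statement with a literal
object, the subject/predicate/object ordering stands in for the comparison
algorithm specified in the PEG, and the header line and the escaping rules
are placeholders. Only the shape (normalize to NFC, sort, write as UTF-8,
escape while writing) is taken from the description.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;
import java.util.Collection;
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

final class CanonicalWriterSketch {
    private CanonicalWriterSketch() {}

    /** Hypothetical stand-in for a statement; terms are normalized to NFC on creation. */
    static final class Triple {
        final String subject, predicate, object;
        Triple(String s, String p, String o) {
            this.subject = Normalizer.normalize(s, Normalizer.Form.NFC);
            this.predicate = Normalizer.normalize(p, Normalizer.Form.NFC);
            this.object = Normalizer.normalize(o, Normalizer.Form.NFC);
        }
    }

    /** Code point order; String.compareTo would compare UTF-16 code units instead. */
    static final Comparator<String> BY_CODE_POINT = (a, b) -> {
        int i = 0;
        while (i < a.length() && i < b.length()) {
            int ca = a.codePointAt(i), cb = b.codePointAt(i);
            if (ca != cb) return Integer.compare(ca, cb);
            i += Character.charCount(ca);
        }
        return Integer.compare(a.length(), b.length());
    };

    /** Placeholder statement order (assumption): subject, then predicate, then object. */
    static final Comparator<Triple> ORDER =
        Comparator.comparing((Triple t) -> t.subject, BY_CODE_POINT)
                  .thenComparing(t -> t.predicate, BY_CODE_POINT)
                  .thenComparing(t -> t.object, BY_CODE_POINT);

    /** Placeholder escaping (assumption): backslash and double quote only. */
    static String escape(String literal) {
        return literal.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Sorts the statements and serializes them as UTF-8 bytes. */
    static byte[] toBytes(Collection<Triple> statements) {
        SortedSet<Triple> sorted = new TreeSet<>(ORDER);
        sorted.addAll(statements);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write("# placeholder header\n");
            for (Triple t : sorted) {
                // Escaping of the literal happens here, at write time.
                w.write("<" + t.subject + "> <" + t.predicate + "> \""
                        + escape(t.object) + "\" .\n");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Because the set is sorted and duplicates collapse under the comparator, the
same statements should come out as the same bytes regardless of the order in
which they were added, which is the point of a canonical format.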
