[Gzz-commits] gzz/Documentation/misc storm-urn-application.txt


From: Benja Fallenstein
Subject: [Gzz-commits] gzz/Documentation/misc storm-urn-application.txt
Date: Wed, 22 Jan 2003 10:22:26 -0500

CVSROOT:        /cvsroot/gzz
Module name:    gzz
Changes by:     Benja Fallenstein <address@hidden>      03/01/22 10:22:26

Modified files:
        Documentation/misc: storm-urn-application.txt 

Log message:
        Update. Please read & comment now!

CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/gzz/Documentation/misc/storm-urn-application.txt.diff?tr1=1.1&tr2=1.2&r1=text&r2=text

Patches:
Index: gzz/Documentation/misc/storm-urn-application.txt
diff -u gzz/Documentation/misc/storm-urn-application.txt:1.1 gzz/Documentation/misc/storm-urn-application.txt:1.2
--- gzz/Documentation/misc/storm-urn-application.txt:1.1        Sun Jan 12 08:10:48 2003
+++ gzz/Documentation/misc/storm-urn-application.txt    Wed Jan 22 10:22:26 2003
@@ -23,7 +23,7 @@
 
    Identifiers in this namespace represent Storm blocks, 
    immutable byte sequences with MIME-like headers containing 
-   a content type and optional additional metadata. 
+   a content type and optional additional metadata [MIME]. 
    (Any byte sequence with a content type can be represented
    as a Storm block.) Storm is a data storage system under development
    by the registrant. Use of the namespace for any immutable data
@@ -33,14 +33,11 @@
 
       urn:urn-n:block:00<hex-bytes><sha-1>
       urn:urn-n:block:01<sha-1>
-      urn:urn-n:block:02<sha-1>:<content-type>
-      urn:urn-n:block:02<sha-1><content-type-bytes>
+      urn:urn-n:block:02<sha-1>
 
    where 'urn-n' is the namespace ID assigned by IANA;
-   <sha-1> is a SHA-1 hash represented in hexadecimal form;
-   <hex-bytes> is an arbitrary sequence of at most 100 bytes in hex;
-   <content-type> is a MIME content type; and <content-type-bytes>
-   is a MIME content type, represented in hexadecimal form.
+   <sha-1> is a SHA-1 hash represented in hexadecimal form [SHA-1]; and
+   <hex-bytes> is an arbitrary sequence of at most 100 bytes in hex.
    The first two digits after 'block:' serve as a version number,
    allowing future versions of this registration to introduce
    additional variants.
@@ -53,29 +50,24 @@
    Identifiers of the 02 variant are designed for use with data
    in existing systems, possibly already indexed by its SHA-1 hash
    (for example, in a file sharing system). In identifiers
-   of this variant, <sha-1> is the SHA-1 hash of the block's body; 
-   the block's header, which is not hashed, is defined to be
-   the string "Content-Type: ", followed by a normalized form
-   of <content-type> (see below), followed by CRLF, the string
-   "Content-Transfer-Encoding: binary", followed by CRLF CRLF.
-   If the <content-type-bytes> form is used instead of the
-   <content-type> form, <content-type-bytes> is converted
-   from its hexadecimal representation into a sequence of bytes
-   which is then used as the <content-type>.
-
-   To normalize <content-type>, XXX
+   of this variant, <sha-1> is the SHA-1 hash of the block's header.
+   The header must contain a "Storm-Content-Hash" header field
+   whose value is the SHA-1 hash of the block's body, in hex.
 
    Identifiers of the 00 variant are provided for backward 
    compatibility only; applications must not generate new blocks
-   with 00 ids. The <sha-1> of a 00 block is computed as follows:
+   with 00 ids. The <sha-1> part of a 00 block identifier 
+   is computed as follows:
 
-   - Let n be the length of <hex-bytes> plus one.
+   - Let n be the length of the byte sequence represented
+     by <hex-bytes>, plus one.
    - Represent n as a four-byte big-endian long integer.
      (Since n <= 101, the first three bytes will always be zero.)
-   - Concatenate to this a zero byte (eight zero bits).
+   - Concatenate to this a zero octet.
    - Concatenate to this the bytes encoded in <hex-bytes>.
    - Concatenate to this the header and body of the block,
      as with 01 blocks.
+   - Compute the SHA-1 hash of this byte sequence.
 
    Wherever byte sequences are encoded in hexadecimal form,
    both uppercase and lowercase are allowed for the
@@ -86,18 +78,19 @@
    and 02 identifiers will not be changed. This means that
    a valid URI starting with
 
-      urn:<urn-n>:block:01
+      urn:urn-n:block:01
 
    will always have the form described for 01 identifiers, above.
    
    Future versions of this registration may allow identifiers
-   in this namespace that do not start with "urn:<urn-n>:block:"
+   in this namespace that do not start with "urn:urn-n:block:",
    to accommodate additional aspects of the Storm system
    besides blocks.
 
 Relevant ancillary documentation:
 
-   TBD. (SHA-1, RFC 2822, MIME)
+   The SHA-1 algorithm is defined in [SHA-1]. The basic format
+   of Storm blocks is defined in [MIME], though 
 
 Identifier uniqueness considerations:
 
@@ -105,13 +98,114 @@
    cryptographic hash function, as long as no successful
    attacks are found it can be assumed that no two different
    byte sequences will ever be generated that have the same
-   SHA-1 hash. Therefore, as long as SHA-1 is not broken,
-   no two different Storm blocks will ever be assigned
+   SHA-1 hash. Therefore, it is reasonable to assume
+   that no two different Storm blocks will ever be assigned
    the same identifier in this namespace.
 
-   However, sometimes applications may want to distinguish
-   between two different instances of a byte sequence.
-   In particular, an application may store all of a user's keystrokes
-   from one session in one Storm block, yet 
+Identifier persistence considerations:
+
+   The use of a cryptographic hash in the identifier
+   makes it impossible to reassign an identifier
+   to a different block.
+
+Process of identifier assignment:
+
+   Assignment is completely open. An identifier 
+   is assigned by creating a Storm block (header and body)
+   and computing an identifier of either the 01 or 02 variant,
+   as specified above.
+
+Process for identifier resolution:
+
+   Identifiers may be resolved by any system that allows
+   looking up data by its SHA-1 hash. This method
+   of identification is becoming popular with a number of systems
+   (notably in the area of peer-to-peer file sharing),
+   and the use of identifiers in this namespace is not intended
+   to be tied to any particular one of these systems.
+
+Rules for Lexical Equivalence:
+
+   All identifiers as defined in this registration are
+   entirely case-insensitive; two such identifiers are
+   lexically equivalent if they are equal ignoring case.
+   This rule may not hold for additional types of identifiers 
+   defined in future versions of this registration.
+
+   If a canonical representation of the identifiers
+   defined by this registration is needed, the
+   'urn:urn-n:block:' part should be in lower case,
+   and everything after it in upper case.
+
+   In addition, it should be noted that any two identifiers
+   as defined by this registration refer to the same
+   resource (Storm block) if and only if the identifiers are
+   lexically equivalent, ignoring case. Storm blocks
+   with equal headers and bodies, but different identifiers,
+   are not considered to be the same.
    
+Conformance with URN Syntax:
+
+   It should be noted that "%"-escaping, as defined by [RFC-2141],
+   is not legal in identifiers defined by this registration,
+   as there are no reserved characters that might
+   need to be escaped. This is relevant as it eases comparison
+   for lexical equivalence, as defined above.
+
+Validation mechanism:
+
+   There are two ways in which an identifier as defined
+   by this registration may be validated. Firstly, 
+   the namespace-specific string (nss) of any such identifier 
+   must conform to the following ABNF [RFC-2234] grammar:
+
+      nss        = "block:" (id00 / id01 / id02)
+      
+      id00       = "00" *100(hex-byte) sha-1
+      
+      id01       = "01" sha-1
+
+      id02       = "02" sha-1
+
+      sha-1      = 20(hex-byte)
+
+      hex-byte   = HEXDIG HEXDIG
+
+   Secondly, if the data whose hash is included in the identifier
+   is known, it is possible to check whether
+   it really forms a Storm block (only then can the identifier
+   be truly considered valid). A Storm block is a MIME message [MIME]
+   with the following exceptions:
+
+   - All Storm blocks must contain a Content-Type header field.
+   - All Storm blocks must contain a Content-Transfer-Encoding
+     header field with value 'binary'.
+   - Storm blocks may contain a Storm-Content-Hash header field.
+     If present, it must contain the SHA-1 hash of the block's
+     body in hexadecimal representation. Blocks with an identifier
+     of the 02 variant must have a Storm-Content-Hash header field.
+   - In bodies of "text" content types, newlines are represented
+     as LF characters, not as CRLF.
+
+Scope:
+
+   Global.
+
+References:
+
+   [SHA-1]
+      NIST, FIPS PUB 180-1: Secure Hash Standard, April 1995.
+      http://csrc.nist.gov/fips/fip180-1.txt (ascii)
+      http://csrc.nist.gov/fips/fip180-1.ps  (postscript)
+
+   [MIME]
+      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+      Extensions (MIME) Part One: Format of Internet Message Bodies", 
+      RFC 2045, November 1996.
+
+      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+      Extensions (MIME) Part Two: Media Types", RFC 2046, 
+      November 1996.
+
+
 
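The hashing rules in the patch above can be sketched in Python. This is
a minimal illustration, not part of the registration. The function
names (sha1_hex, id_00_hash, make_02_block) are hypothetical, and
'urn-n' remains the unassigned placeholder used in the text. The header
layout built for the 02 variant (field order, CRLF line endings, and
whether the terminating blank line is part of the hashed header) is an
assumption that the patch does not pin down. The sketch also assumes
that the caller passes header and body bytes exactly as they would be
hashed for 01 blocks.

```python
import hashlib
from typing import Tuple


def sha1_hex(data: bytes) -> str:
    """SHA-1 of data as 40 hex digits (upper case, matching the canonical form)."""
    return hashlib.sha1(data).hexdigest().upper()


def id_00_hash(hex_bytes: str, header: bytes, body: bytes) -> str:
    """Compute the <sha-1> part of a 00 identifier (backward compatibility only)."""
    prefix = bytes.fromhex(hex_bytes)          # the bytes encoded in <hex-bytes>
    if len(prefix) > 100:
        raise ValueError("<hex-bytes> may encode at most 100 bytes")
    n = len(prefix) + 1                        # length of the byte sequence, plus one
    data = (
        n.to_bytes(4, "big")                   # four-byte big-endian integer
        + b"\x00"                              # a zero octet
        + prefix
        + header + body                        # header and body, as with 01 blocks
    )
    return sha1_hex(data)


def make_02_block(content_type: str, body: bytes) -> Tuple[str, bytes]:
    """Build a header for body and return (02 identifier, header bytes).

    Assumed layout: three fields in this order, CRLF line endings, and a
    terminating blank line that is hashed together with the header.
    """
    header = (
        f"Content-Type: {content_type}\r\n"
        "Content-Transfer-Encoding: binary\r\n"
        f"Storm-Content-Hash: {sha1_hex(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    # For the 02 variant, <sha-1> is the hash of the header, not of the body.
    return "urn:urn-n:block:02" + sha1_hex(header), header
```

A caller would use make_02_block("text/plain", data) for data already
indexed by its SHA-1 hash: the body hash appears in the header, so a
resolver that looks up data by SHA-1 can fetch the header by the hash
in the identifier and then the body by the Storm-Content-Hash value.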




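The ABNF grammar and the lexical equivalence rules in the patch can be
approximated with a regular expression and two small helpers. Again
this is only a sketch under stated assumptions: the names NSS_RE,
is_valid_nss, canonicalize and lexically_equivalent are hypothetical,
and the check covers syntax only. It does not verify that the hashed
data really forms a Storm block.

```python
import re

# nss  = "block:" (id00 / id01 / id02)
# id00 = "00" *100(hex-byte) sha-1   -> 20 to 120 hex-byte pairs in total
# id01 = "01" sha-1, id02 = "02" sha-1, sha-1 = 20(hex-byte)
NSS_RE = re.compile(
    r"block:(?:00(?:[0-9a-f]{2}){20,120}|0[12](?:[0-9a-f]{2}){20})",
    re.IGNORECASE,  # the identifiers are entirely case-insensitive
)


def is_valid_nss(nss: str) -> bool:
    """Syntactic check of a namespace-specific string against the grammar."""
    return NSS_RE.fullmatch(nss) is not None


def canonicalize(urn: str) -> str:
    """Lower-case the 'urn:urn-n:block:' part and upper-case everything after it."""
    parts = urn.split(":", 3)                  # "urn", namespace ID, "block", rest
    if len(parts) != 4 or parts[0].lower() != "urn" or parts[2].lower() != "block":
        raise ValueError("not a block identifier: " + urn)
    return "urn:" + parts[1].lower() + ":block:" + parts[3].upper()


def lexically_equivalent(a: str, b: str) -> bool:
    """Two identifiers are lexically equivalent if they are equal ignoring case."""
    return a.lower() == b.lower()
```

Since "%"-escaping is not legal in these identifiers, a plain
case-insensitive string comparison is sufficient here, and no
unescaping step is needed before comparing.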