[Gzz-commits] manuscripts/storm article.rst
From: Benja Fallenstein
Subject: [Gzz-commits] manuscripts/storm article.rst
Date: Sun, 02 Feb 2003 22:33:49 -0500
CVSROOT: /cvsroot/gzz
Module name: manuscripts
Changes by: Benja Fallenstein <address@hidden> 03/02/02 22:33:49
Modified files:
storm : article.rst
Log message:
Xanalogical storage explained ;-)
CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/manuscripts/storm/article.rst.diff?tr1=1.71&tr2=1.72&r1=text&r2=text
Patches:
Index: manuscripts/storm/article.rst
diff -u manuscripts/storm/article.rst:1.71 manuscripts/storm/article.rst:1.72
--- manuscripts/storm/article.rst:1.71 Sat Feb 1 22:45:37 2003
+++ manuscripts/storm/article.rst Sun Feb 2 22:33:49 2003
@@ -1,16 +1,6 @@
-============================================================================
-Gzz Storm: Supporting data mobility through location independent identifiers
-============================================================================
-
-(an-other way (too buzzwordy? but generalized!): enabling distributed
-mobile hypermedia with location independent unique document identifiers)
-['distributed mobile hypermedia' is too limited --b.]
-[Perhaps 'Storm: Supporting data mobility through location independent
-identifiers' is enough ? We could mention in the text (as we do mention ;),
-that Storm is used also in our Gzz project. Do we really bind name 'Storm' to name
-'Gzz' in the main title ? This may have psychological effects: reader
-might first think that Storm can only be used with Gzz. And this is not
-true. -Hermanni]
+========================================================================
+Storm: Supporting data mobility through location independent identifiers
+========================================================================
1. Introduction
===============
@@ -63,6 +53,16 @@
systems (location independent identifiers, immutable block storage, *working*
links etc.)
-use of p2p architecture in hypermedia domain
+Gzz provides a platform to build hypermedia applications upon.
+So far, we have only used Storm in our experimental
+hypermedia system, Gzz. No work on integrating Storm
+with current programs (in the spirit of Open Hypermedia)
+has been done so far. It is not clear how far this is possible
+without changing applications substantially, if advantage
+of our implementation of Xanalogical storage is to be taken.
+(Vitali [ref] notes that Xanalogical storage necessitates
+strong discipline in version tracking, which current systems lack.)
+
This paper is structured as follows. In next section, we describe
related work. In section 3, we introduce the basic storage unit of our
system, file-like blocks of data identified by cryptographic hashes.
@@ -344,7 +344,91 @@
4. Xanalogical storage
======================
-Xanalogical storage, pioneered by Project Xanadu [ref],
+In the xanalogical storage model [ref],
+pioneered by the unfinished Project Xanadu [ref],
+links are not between documents, but individual characters.
+When a character is first typed in, it acquires a permanent id
+("the character 'D' typed by Janne Kujala on 10/8/97 8:37:18"),
+which it retains when copied to a different document, distinguishing
+it from all similar characters typed in independently [#]_.
+A link is shown between any two documents containing the characters
+that the link connects. Xanalogical links are external and bidirectional.
+
+.. [#] Xanalogical storage is not limited to text. We speak about
+ *characters* because it simplifies the explanation; pixels
+ or frames of video could be substituted.
+
+In addition to content links, xanalogical storage keeps an index of
+transclusions: identical characters copied into different documents.
+Through this mechanism, the system can show to the user all documents
+that share text with the current document.
+
+To keep track of links and transclusions, the system keeps a global index
+of documents by the characters they contain, and of links by the characters
+they refer to. Thus, for each character in the document, the system
+queries the index for other documents containing this character,
+and shows them as transclusions. Resolving links is a multi-step process.
+Each link is modeled as two collections of characters: the two
+endpoints of the link. To show links to a document,
+the system firstly uses the link index to find links
+to each character in the document. Secondly, for each link,
+it looks at the *other* set of characters in the link-- the target
+of the link, if the original character was the source, and vice versa.
+Thirdly, it looks for documents containing these target characters.
+This way, even if both the source and target characters of the link
+are moved to a different document, the link stays connected to them.
+
+Of course, doing any expensive operation for *every* character
+in a document does not scale very well. In practice,
+characters typed in consecutively are given consecutive ids,
+such as ``...:4``, ``...:5``, ``...:6`` and so on, and
+operations are on *spans*, consecutive ranges of characters
+(``...:4-6``). In Storm, in each editor session we create a
+block with all characters entered in this session (the content type
+being ``text/plain``). To designate a span of characters
+from that session, we use the block's id, the offset of the first
+character, and the number of characters in the span.
+This technique was first introduced in [ref ht02 paper].
+
+In Xanadu, characters are written to append-only *scrolls*
+when they are typed [ref]. Because of this, we call the blocks
+containing the actual characters *scroll blocks*. The documents
+do not actually contain the characters; instead, they are
+*virtual files* containing span references as described above.
+To show a document, the scroll blocks it references are loaded
+and the characters retrieved from there [#]_.
+
+.. [#] It is unclear whether this approach is efficient for text
+ in the Storm framework; in the future, we may try storing
+ the characters in the documents themselves, along with their
+ permanent identifiers. For images or video, on the other hand,
+ it is clearly beneficial if content appearing in different
+ documents-- or different versions of a document-- is only
+ stored once, in a block only referred to wherever
+ the data is transcluded.
+
+Our current implementation shows only links between documents
+that are in memory at the same time [screenshot of xupdf].
+In the future, we will implement a global index on top of
+a distributed hashtable, with the scroll blocks' ids as the keys.
+To find the transclusions of a span, the system will retrieve
+all transclusions of any span in the scroll block, then
+sort out those that do not overlap the span in question.
+
+Since the problem is to search for overlapping ranges,
+the spans cannot be used as hashtable keys. However, as the blocks
+will be relatively small (limited by the amount of text
+the user enters between two saves of a document), we hope
+that this will not be a major scalability problem. Otherwise,
+systems that allow range queries, such as skip graphs [ref],
+may prove useful.
+
+One question raised by xanalogical storage is which links to show
+for a popular document that has been linked to by many users.
+We hope to address this problem by collaborative filtering
+of links [explain, ref]. There has been research on
+collaborative filtering in peer-to-peer systems
+without compromising participants' privacy [ref John Canny].
5. Indexing