[Gzz] Structure proposal: RDF (+Xu)
From: |
Benja Fallenstein |
Subject: |
[Gzz] Structure proposal: RDF (+Xu) |
Date: |
Sun, 16 Feb 2003 15:07:52 +0100 |
User-agent: |
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021226 Debian/1.2.1-9 |
Hi all,
I didn't want to spend time writing this up while we were working on the
article, but I thought a lot about RDF as our new structure last week.
First, our requirements, in a nutshell:
- A really simple metastructure for defining applitude structures
- Support orthogonal structure (a person can be in a birthday and a
calendar applitude simultaneously)
- Support a useful 'basic' view (the structure can directly be viewed
and edited, using the mouse or the keyboard, and for many concerns this
is a useful tool in itself); exploration of the view is bidirectional
- The cursors-and-slices problem should be solved
- Support defining consistency rules (Tuomas')
- Support xanalogical content
(I have probably forgotten one or two; this is what I could think of right now.)
The RDF model is as follows. (I'm structuring into 'steps' for easier
understandability-- step 3 is the real model.)
Step 1. A directed labelled graph where each node is a URI (actually,
URIref-- it can contain a fragment identifier), and each arc is labelled
by a URI. In other words, a set of triples of URIs. Triples are
interpreted as [subject, predicate, object].
Step 2. In addition to step 1, the objects of triples can be *literals*
instead of URIs. A literal is a Unicode string, with an optional
datatype (URIref) and an optional language attribute ('de', 'fi' etc).
Step 3. In addition to step 2, the subjects and objects of triples can
be *blank nodes*. A blank node is like a URIref, but local to a graph
(if you join two graphs, the blank nodes in each one are different).
(Step 3 is the most unfortunate for our purposes, but that's life. In
RDF, blank nodes are existential quantifiers-- "there is a resource for
which these assertions hold.")
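The three steps can be sketched in a few lines of Python. This is only an illustration of the model as described above, not an existing RDF library: the class names, the `triple` helper and the `http://example.org/` URIs are all made up for the example.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass(frozen=True)
class URIRef:
    """Step 1: a node or arc label identified by a URI reference."""
    uri: str


@dataclass(frozen=True)
class Literal:
    """Step 2: a Unicode string with optional datatype and language."""
    text: str
    datatype: Optional[URIRef] = None
    language: Optional[str] = None


@dataclass(frozen=True)
class BlankNode:
    """Step 3: a node without a global name; only its identity matters."""
    ident: int


_blank_ids = count()


def new_blank() -> BlankNode:
    # Every call yields a node distinct from all earlier ones.
    return BlankNode(next(_blank_ids))


def triple(subject, predicate, obj):
    """Build a [subject, predicate, object] triple, enforcing steps 1-3."""
    if not isinstance(subject, (URIRef, BlankNode)):
        raise TypeError("subject must be a URIref or a blank node")
    if not isinstance(predicate, URIRef):
        raise TypeError("predicate must be a URIref")
    if not isinstance(obj, (URIRef, BlankNode, Literal)):
        raise TypeError("object must be a URIref, blank node or literal")
    return (subject, predicate, obj)


# Hypothetical example vocabulary.
EX = "http://example.org/"
alice = URIRef(EX + "alice")
name = URIRef(EX + "name")
knows = URIRef(EX + "knows")
someone = new_blank()

# A graph is just a set of triples.
graph = {
    triple(alice, name, Literal("Alice", language="en")),
    triple(alice, knows, someone),  # "alice knows somebody"
}
```

Note that a literal is rejected as a subject, matching step 2, where only the objects of triples may be literals.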
RDF can meet the requirements I listed as follows.
- The structure can be explained in three short paragraphs, as above,
and you can build anything out of it.
- You can use any predicate with any subject/object-- orthogonal
structure. (This is intended.)
- Cursors will be implemented as [x, accurses, y]. When graphs are
joined (using simple set union), the "right thing" happens.
- There are a number of languages for defining consistency rules at
different complexities, including RDF Schema and the Web Ontology
Language (OWL). These allow defining cardinalities, domains and ranges
of properties, classes of things, etc. You can get pretty complex if you
want to. You can also use RDF without any of these if you want to.
They're all written in RDF themselves.
- To cater for Xu content, we should probably define a type of literal.
Making all literals Xu content would be embrace&extend, and would also
not fit the idea that you can store literals as their 'real' values
(store the integer 755 instead of the string '755' etc.). Supporting &
advocating Xu content while remaining compatible is a better strategy
than stubbornly insisting on it and driving people away from our system.
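To illustrate the cursor point from the list above: if a graph is a plain set of triples, joining two graphs is set union, and each cursor triple survives the join untouched. The sketch below assumes that this is what the "right thing" means; the `urn:example:` identifiers are invented, and URIrefs and literals are simplified to plain strings.

```python
# Hypothetical URIs; strings stand in for URIrefs and literals.
ACCURSES = "urn:example:accurses"

graph_a = {
    ("urn:example:cursor1", ACCURSES, "urn:example:alice"),
    ("urn:example:alice", "urn:example:name", "Alice"),
}
graph_b = {
    ("urn:example:cursor2", ACCURSES, "urn:example:bob"),
    ("urn:example:alice", "urn:example:name", "Alice"),
}

# Joining graphs is simple set union; the shared triple is stored once.
merged = graph_a | graph_b


def accursed(graph, cursor):
    """Return the set of nodes the given cursor points at."""
    return {o for (s, p, o) in graph if s == cursor and p == ACCURSES}
```

After the join, `accursed(merged, "urn:example:cursor1")` still yields only `urn:example:alice`, and the second cursor still points at `urn:example:bob`. Blank nodes would need to be kept apart before taking the union; that is left out here.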
Tuomas has talked about putting constraints into Storm blocks, so that
structures can tell which rules they follow. This flows *very* well with
RDF: The recommendation is to assign permanent URIs to versions of
schemas/ontologies, so that RDF graphs can declare which version they
adhere to. Same idea (we just need a URN namespace for Storm blocks). We
may even be able to convince some RDF folks that Storm blocks are a
better kind of URI for this than HTTP, esp. once we have built the p2p
tools for resolving them globally.
Viewing and editing takes a little more work. On the other hand, here's
something we have to offer to the RDF community: While there are
graphical RDF editors out there (I said I didn't think there were decent
ones on IRC the other day-- I was wrong of course), I don't think
there's a focus+context one. This has a lot of scalability to offer over
editors trying to map RDF graphs to a 2D plane.
Ok, now, how? I see several views--
0. The pure structure view
This view shows URIrefs as ellipses containing the URIref, blank nodes
as empty ellipses, and literals as rectangles with the literal text in
them. You select a number of properties to view at any time (like the
dimension rose in zz, but simply a set). There's a cursor position, the
focused node, shown in the middle of the screen.
Around this we can show the directly connected nodes in a wheel (like in
Asko's mind- and my notemap star views). Backward connections (triples
with the focused node as object) will always be shown to the left,
forward connections (focused node as subject) will always be shown to
the right (see the xupdf article for reasons :).
In addition to the focus, we keep a selected node. This is shown
directly right of the focused node (if connected poswards) or directly
left (if connected negwards). The up/down cursor keys change the
selected node, rotating the wheel. The left/right cursor keys move to
the node shown left/right of the focused node. The currently focused node
then becomes the selected node, so you can go back by pressing the inverse
cursor key.
A nice thing to note about this structure/view is that it even supports
the genealogy applitude: Given an 'isChild' property, a person's parents
are shown left of them, and a person's children right of them.
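A rough sketch of what this view has to compute, using the genealogy example. It assumes that 'isChild' triples read [parent, isChild, child], which is the direction that puts parents on the left; the function names and the short node names are invented for illustration.

```python
def connections(graph, focus, shown_properties):
    """Split the neighbours of `focus` into backward (left) and forward
    (right) lists, restricted to the currently selected properties."""
    backward = sorted({s for (s, p, o) in graph
                       if o == focus and p in shown_properties})
    forward = sorted({o for (s, p, o) in graph
                      if s == focus and p in shown_properties})
    return backward, forward


def move(focus, selected):
    """Move the focus onto the selected node. The old focus becomes the
    selected node, so the inverse cursor key leads straight back."""
    return selected, focus


# Assumed reading: [parent, isChild, child].
family = {
    ("anna", "isChild", "ben"),
    ("carl", "isChild", "ben"),
    ("ben", "isChild", "dora"),
    ("ben", "worksAt", "acme"),  # hidden unless 'worksAt' is selected
}

left, right = connections(family, "ben", {"isChild"})
# left holds ben's parents, right holds ben's children.
```

Rotating the wheel with the up/down keys would then step the selected node through `left + right`; that part is omitted.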
An alternative view is to make vertical lists of the forward/backward
connections, like this::
    neg1 \
    neg2 \         / pos1
    neg3 - focus - pos2
    neg4 /         \ pos3
    neg5 /
We keep a selected node as before, and adjust the lists vertically as
the selected node is changed using the up/down keys.
Key bindings for making connections need thinking...
1. The basic view (pure structure + cell views + sorting)
The fundamental view isn't very useful except for debugging, because the
URIrefs are really internal things and I don't want to see a urn-5 when
I look at the node representing, say, a person.
Therefore we need cell views that select the content to show inside a
node. This needs a bit of knowledge about the things to be shown; e.g.,
for people we usually show their name (firstname + " " + lastname, say).
Some things may be simply identified by an rdfs:label (a property that has
a human-readable label for a node as its object), but many things won't
be, as we want to represent the actual *semantics* here and a person's
name is something more specific than a 'label' for them.
This setting may simply be:
- A number of properties to be tried in order. The first property that
has a value for a node is used to determine the text shown in this node.
- A property to be used for editing. When I hit 'Tab' (or whichever key
we'll use), this property's value for this node is edited. E.g. we may
not show a name because no name has been entered yet, but when I hit Tab
and start typing, what I type becomes the name.
We need the same two settings for languages-- a list of languages to be
tried in order, plus a language to be changed when we hit 'Tab.'
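The two settings could look roughly like this. It is a sketch under simplifying assumptions: every object in the graph is a literal written as a `(text, language)` pair, property names are short strings rather than URIrefs, and `cell_text` and `edit_cell` are hypothetical names. How languages and properties interact (here: try all languages for one property before moving on to the next property) is also a guess.

```python
def cell_text(graph, node, label_properties, languages):
    """Return the text to show in a node: the first property in
    `label_properties` with a value in one of the preferred languages."""
    for prop in label_properties:
        values = sorted((o for (s, p, o) in graph
                         if s == node and p == prop),
                        key=lambda value: value[0])
        for language in languages:
            for text, value_language in values:
                if value_language == language:
                    return text
    return None  # caller falls back to e.g. the URIref


def edit_cell(graph, node, edit_property, language, new_text):
    """Return a new graph in which the value of the editing property
    (in the editing language) for `node` is replaced by `new_text`."""
    kept = {(s, p, o) for (s, p, o) in graph
            if not (s == node and p == edit_property and o[1] == language)}
    return kept | {(node, edit_property, (new_text, language))}


people = {
    ("urn:example:p1", "fullName", ("Ada Lovelace", "en")),
    ("urn:example:p1", "label", ("Person 1", "en")),
    ("urn:example:p2", "label", ("Henkilö 2", "fi")),
    ("urn:example:p2", "label", ("Person 2", "en")),
}
settings = (["fullName", "label"], ["fi", "en"])
```

With these settings, `p1` shows its full name even though a label exists, and `p2` shows the Finnish label because 'fi' is tried first. Editing a node without a name, as when hitting Tab, simply adds the missing triple.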
Additionally, we'd want to be able to impose an order on the nodes
shown. Intrinsically, "the triples with subject X and any of the
properties {A,B,C}" do not have an order, obviously, so we'll order them
somehow arbitrarily, e.g. alphabetically by URIref. It is practical if
we can order people by first name, for example, or by age, or by last
name and then by age.
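Ordering by a list of properties, with the URIref as the final tie-breaker, might be sketched as follows. The names are hypothetical, and the sketch assumes that all values of one property have the same type; nodes without a value for a property sort last.

```python
def value_of(graph, node, prop):
    """Smallest value of `prop` for `node`, or None if there is none."""
    values = sorted(o for (s, p, o) in graph if s == node and p == prop)
    return values[0] if values else None


def ordered(graph, nodes, sort_properties):
    """Sort nodes by the given properties in turn, then by URIref."""
    def sort_key(node):
        parts = []
        for prop in sort_properties:
            value = value_of(graph, node, prop)
            # The leading flag pushes nodes without a value to the end
            # and keeps None from being compared with real values.
            parts.append((value is None, 0 if value is None else value))
        return (parts, node)
    return sorted(nodes, key=sort_key)


persons = {
    ("urn:example:a", "lastName", "Smith"), ("urn:example:a", "age", 40),
    ("urn:example:b", "lastName", "Jones"), ("urn:example:b", "age", 35),
    ("urn:example:c", "lastName", "Smith"), ("urn:example:c", "age", 22),
}
nodes = {"urn:example:a", "urn:example:b", "urn:example:c"}

by_name_then_age = ordered(persons, nodes, ["lastName", "age"])
```

With an empty property list, the same function gives the arbitrary-but-stable alphabetical order by URIref.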
This view is useful for many purposes.
2. Views for specific structures
We also need special views to show lists well, and table views: select a
set of nodes (rows) and a set of properties (columns) and show a table
that can be sorted in different ways. These views are still generic, not
for a specific applitude.
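The generic table view reduces to a small function. In this sketch (hypothetical names, string-only values) each cell is a list, because a node may have several values for one property, or none.

```python
def table(graph, rows, columns):
    """One row per node, one cell per property; each cell is the sorted
    list of that node's values for that property."""
    return [[sorted(o for (s, p, o) in graph if s == node and p == column)
             for column in columns]
            for node in rows]


contacts = {
    ("urn:example:a", "name", "Anna"),
    ("urn:example:a", "email", "anna@example.org"),
    ("urn:example:a", "email", "a@example.org"),
    ("urn:example:b", "name", "Ben"),
}

grid = table(contacts,
             rows=["urn:example:a", "urn:example:b"],
             columns=["name", "email"])
```

Sorting the table then only means reordering `rows` before calling `table`.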
We may even be able to build some mildly zz-like views without
infringing the patent, assigning different screen directions to
different properties. Look at this figure from the RDF primer:
http://www.w3.org/TR/rdf-primer/#figure16
Now, if RDF meets our requirements, why would we choose it over other
alternatives?
The strongest reason is, **it has users**. It is a W3C standard. There
are implementations. With zzstructure, at least we had the zz community;
if we roll our own structure, we have nobody else using it. RDF is
already being used for many purposes, there are defined data formats
(ontologies), there are editors. (Next time the client doesn't start,
use W3C's IsaViz! Or one of the other Free Software RDF editors out
there.) There are parsers. And so on. Provided that it meets our
requirements, not using RDF has a smell of 'not invented here' ;-)
Next, it supports associative linking well. Notes in zzstructure tend to
become trees, as that's what's easy to do in the structure. When I
seriously started taking notes in zzstructure (2 1/2 years ago now), I
invented a rather weird 'link' pattern to make horizontal links between
branches of a tree in two-dimensional zzstructure-- I like it :), but
typed m:n links would have been more appropriate, I think. For untyped
m:n links, RDF has a predefined 'hasToDoWith' property. For typed links,
users can make up their own properties. -- RDF and the view proposed
above support trees well: as a special kind of m:n linking. If you want
to add a second 'parent' to a node, you can trivially do so.
Next, it's relatively easy to do virtual structures on top of RDF. As
RDF people have logical inference in the backs of their heads, some
existing APIs for RDF even support this explicitly. Virtual "cell
contents" (i.e., literal values) and virtual connections between URIrefs
work the same way (they're both virtual triples). I would like Gzz to be
what you could call a 'semantic spreadsheet': You can make formulas for
semantic relationships as well as 'numbers in cells,' so when you change
something in the structure, the computed connections change. This is
harder in e.g. zzstructure. (I think this is a very important idea, but
I'm having difficulty explaining it well here. I will have to do a
separate writeup.)
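Until then, here is one very small way to picture the 'semantic spreadsheet'. The "formulas" are plain functions that compute virtual triples from the stored ones, and they are re-evaluated on every read, so computed connections follow changes in the structure. Everything here is a made-up illustration: `VirtualGraph`, the rule functions and the property names are hypothetical, 'isChild' is read as [parent, isChild, child], and rules see only stored triples (no chaining of rules).

```python
def grandchild_rule(graph):
    """Virtual connection: a hasGrandchild c if a isChild b, b isChild c."""
    return {(a, "hasGrandchild", c)
            for (a, p1, b) in graph if p1 == "isChild"
            for (b2, p2, c) in graph if p2 == "isChild" and b2 == b}


def full_name_rule(graph):
    """Virtual literal: fullName computed from firstName and lastName."""
    first = {s: o for (s, p, o) in graph if p == "firstName"}
    last = {s: o for (s, p, o) in graph if p == "lastName"}
    return {(s, "fullName", first[s] + " " + last[s])
            for s in first if s in last}


class VirtualGraph:
    """Stored triples plus triples computed by rules on every read."""

    def __init__(self, stored, rules):
        self.stored = set(stored)
        self.rules = list(rules)

    def triples(self):
        result = set(self.stored)
        for rule in self.rules:
            result |= rule(self.stored)
        return result


g = VirtualGraph(
    {
        ("anna", "isChild", "ben"),
        ("ben", "isChild", "dora"),
        ("anna", "firstName", "Anna"),
        ("anna", "lastName", "Virtanen"),
    },
    [grandchild_rule, full_name_rule],
)
```

Here `g.triples()` contains both a computed connection (`anna hasGrandchild dora`) and a computed "cell content" (`anna fullName "Anna Virtanen"`); both are just triples. Adding `("dora", "isChild", "eli")` to `g.stored` makes `ben hasGrandchild eli` appear on the next read.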
I guess that's enough for today. Comments, please!
- Benja