dotgnu-visionaries

[Visionaries] User data storage


From: Peter Minten
Subject: [Visionaries] User data storage
Date: Tue, 15 Jul 2003 17:38:15 +0200
User-agent: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:1.4) Gecko/20030529

Hi folks,

it would be nice if user data could be stored in a standard but simple way. That
means not having 1000 file formats for 1000 webservices. Different webservices
need to store different data, but that's no reason to reinvent the syntax wheel
time and time again.

Furthermore, since we're trying to give the user freedom, the standard syntax
should be easily readable so that the user knows what's on his/her system.

The most flexible form of data storage is probably a graph, with the vertices
being data nodes. RDF is modeled around this concept.
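
To make the graph idea concrete, here is a minimal sketch in Python. It is not
taken from RDF tooling or from DotGNU; the identifiers (doc:letter1,
service:editor, user:peter) and the helper name objects are made up for
illustration. Data is held as (subject, predicate, object) triples, where each
triple is one edge of the graph:

```python
# Hypothetical example data: each triple is (subject, predicate, object),
# i.e. one labelled edge between two nodes of the graph.
triples = {
    ("doc:letter1", "createdBy", "service:editor"),
    ("doc:letter1", "owner", "user:peter"),
    ("doc:letter1", "title", "Letter to the list"),
}


def objects(graph, subject, predicate):
    """Return all objects reachable from `subject` via `predicate`, sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)
```

Asking for objects(triples, "doc:letter1", "owner") walks one edge of the graph
and returns the node on the other end.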

The most readable structured form of data storage is AFAIK YAML. YAML has quite
simple semantics but still allows many kinds of structured data to be used; it
even lets you set the type of the data manually.

As I've (kinda) proven before (in mail 'RDF/YAML' to this list) RDF can be
expressed in YAML. Granted, YAML is still an emerging technology and is not yet
standardized on stuff like namespaces (which are heavily used in RDF/XML), but I
have no doubt that that will come soon enough.

RDF can also be stored in many other ways, however, including in memory, in N3
notation or in RDF/XML. But for writing RDF down to a file, YAML is usually best
IMHO.
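
As a rough illustration of writing a graph down as YAML, the sketch below groups
triples by subject and emits a nested mapping as text. This is only one possible
layout and not necessarily the mapping from the 'RDF/YAML' mail; the function
name triples_to_yaml and the example identifiers are assumptions. It uses only
the Python standard library and quotes every string with json.dumps, since JSON
strings are valid double-quoted YAML scalars:

```python
import json
from collections import defaultdict


def triples_to_yaml(triples):
    """Group (subject, predicate, object) triples by subject and return
    them as YAML text: subject -> predicate -> list of objects."""
    by_subject = defaultdict(lambda: defaultdict(list))
    for s, p, o in sorted(triples):
        by_subject[s][p].append(o)

    lines = []
    for s, predicates in by_subject.items():
        lines.append(f"{json.dumps(s)}:")
        for p, objs in predicates.items():
            lines.append(f"  {json.dumps(p)}:")
            for o in objs:
                lines.append(f"    - {json.dumps(o)}")
    return "\n".join(lines) + "\n"


# Hypothetical example graph.
example = [
    ("doc:letter1", "owner", "user:peter"),
    ("doc:letter1", "createdBy", "service:editor"),
]
yaml_text = triples_to_yaml(example)
```

The result is a small, human-readable file with one top-level entry per
document, which is the property the user is supposed to benefit from.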

Now how does this fit into the DotGNU system? Many webservices will want to
store the documents of a user on a remote medium, on the user's hard drive or on
a virtual hard drive [1]. But we want to make DotGNU easy for the user and for
the programmers, so it would be nice if a document could easily be found and
loaded.

Now add webservice interoperability. To avoid a format mess you need clearly
defined formats with data chunks that have clearly defined meanings. RDF is
perfect for that. Thus if all documents are in RDF, interoperability would
greatly increase.

Security is also an important issue. I wouldn't give a webservice full access to
my home directory, even if that were necessary to save a file. Instead I'd
rather have my DGEE download the file and save it. In this case the trusted DGEE
on my computer would do the saving. The key element there is that the webservice
can send data, but the user must tell the DGEE if and where the data should be
stored.
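
The flow can be sketched as follows. This is not the DGEE's actual API, which
this mail does not specify; handle_save_offer, ask_user and write_file are
hypothetical names. The point is only that the webservice supplies the data
while the destination comes from the user:

```python
def handle_save_offer(data, suggested_name, ask_user, write_file):
    """Handle data sent by a webservice.

    `ask_user` is a callback that shows the offer to the user and returns
    a destination path, or None if the user refuses.
    `write_file` performs the actual write on the user's side.
    """
    destination = ask_user(suggested_name, len(data))
    if destination is None:
        return None  # the user declined, so nothing is stored
    write_file(destination, data)
    return destination


# Stand-ins for the user dialogue and the disk, for demonstration only.
saved = {}
accepted = handle_save_offer(
    b"hello", "letter.yaml",
    ask_user=lambda name, size: "documents/" + name,
    write_file=saved.__setitem__,
)
refused = handle_save_offer(
    b"spam", "ad.yaml",
    ask_user=lambda name, size: None,
    write_file=saved.__setitem__,
)
```

The webservice never sees a file path it chose itself being written to; it only
gets to suggest a name.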

Another approach to the security problem is a sandboxed directory for each
webservice, with configurable access for other webservices. You can do this with
normal directories and normal files, but a special directory system and RDF/YAML
files make it work better.
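
A minimal sketch of the sandbox check with normal directories, using Python's
pathlib. The function name resolve_in_sandbox and the example paths are
assumptions; a real implementation would also need to think about symbolic
links created after the check and about operating system permissions:

```python
from pathlib import Path


def resolve_in_sandbox(sandbox_root, requested):
    """Return the absolute path for `requested` inside `sandbox_root`,
    or raise PermissionError if it would escape the sandbox."""
    root = Path(sandbox_root).resolve()
    target = (root / requested).resolve()
    # Accept the root itself or anything that has the root as an ancestor.
    if target != root and root not in target.parents:
        raise PermissionError(f"{requested!r} is outside the sandbox")
    return target
```

A request for "notes/a.yaml" stays inside the sandbox of the webservice, while
"../other/secret.yaml" or an absolute path such as "/etc/passwd" is rejected.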

The idea of the RDF/YAML data storage (which I dub DGFS to save my keyboard :-) is that you have a set of administrative files and a set of data files. The administrative files are:
* Index files (which documents were created by which webservice?, etc.)
* Permission files (who can access these files?)
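
What the two kinds of administrative files could contain is sketched below as
plain Python dictionaries. The layout, the file and service names, and the
deny-by-default rule are all assumptions for illustration; the mail itself does
not fix a format:

```python
# Hypothetical contents of an index file: document -> creating webservice.
index = {
    "letters/list-mail.yaml": "service:editor",
    "letters/draft.yaml": "service:editor",
    "photos/album.yaml": "service:gallery",
}

# Hypothetical contents of a permission file:
# document -> set of webservices that may read it.
permissions = {
    "letters/list-mail.yaml": {"service:editor", "service:spellcheck"},
    "photos/album.yaml": {"service:gallery"},
}


def documents_created_by(index, service):
    """Answer the index question: which documents did `service` create?"""
    return sorted(doc for doc, creator in index.items() if creator == service)


def may_read(permissions, service, document):
    """Answer the permission question. Documents without an entry are
    readable by nobody (deny by default, an assumption of this sketch)."""
    return service in permissions.get(document, set())
```

With the index, finding all documents of one webservice is a lookup rather than
a scan over every data file.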

Additionally the DGFS could protect against data loss by being a CVS repository. Granted, CVS increases the load and save times, but those are not great anyway because of the transmission over the internet. The downside is of course the required disk space, though the absurdly huge hard drives of these days make that a less powerful argument. I for one would be glad if an overwrite of my files could be reverted.
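
The behaviour wanted from versioning can be sketched without CVS. The class
below is not CVS and does not use its commands; it is a minimal in-memory
stand-in (the name VersionedStore is made up) that keeps every revision of a
file so that an overwrite can be undone:

```python
class VersionedStore:
    """Keep every saved revision of each file; revisions are numbered from 1."""

    def __init__(self):
        self._history = {}  # file name -> list of contents, oldest first

    def save(self, name, content):
        """Store a new revision and return its revision number."""
        self._history.setdefault(name, []).append(content)
        return len(self._history[name])

    def load(self, name, revision=None):
        """Return the latest revision, or a specific one if given."""
        revisions = self._history[name]
        return revisions[-1] if revision is None else revisions[revision - 1]

    def revert(self, name, revision):
        """Make an old revision current again by saving it as a new one,
        so that the overwritten content is still not lost."""
        return self.save(name, self.load(name, revision))


store = VersionedStore()
store.save("letter.yaml", "first draft")
store.save("letter.yaml", "accidental overwrite")
new_revision = store.revert("letter.yaml", 1)
```

The cost is visible in the sketch as well: every save adds to the history, which
is the disk space argument mentioned above.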

So in conclusion the DGFS could:
* Decrease search times by providing indices.
* Increase interoperability between webservices.
* Protect files against unwanted reading.
* Save files from deletion.

Greetings,

Peter

[1] It's usually not desirable to store user documents on the server of the
webservice application, since the user wants a clear overview of his/her data
and the webservice provider doesn't want to waste disk space. Thus the data
should be stored on the hard drive of the user, on a floppy, zip disk or
whatever, or on a virtual hard drive. I consider a virtual hard drive the best
option since it's accessible from anywhere.



