[ff-cvs] fenfire/docs/pegboard/swamp_easier--benja peg.rst
From: Benja Fallenstein
Subject: [ff-cvs] fenfire/docs/pegboard/swamp_easier--benja peg.rst
Date: Sat, 27 Sep 2003 13:02:57 -0400
CVSROOT: /cvsroot/fenfire
Module name: fenfire
Branch:
Changes by: Benja Fallenstein <address@hidden> 03/09/27 13:02:57
Modified files:
docs/pegboard/swamp_easier--benja: peg.rst
Log message:
address issues
CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/fenfire/fenfire/docs/pegboard/swamp_easier--benja/peg.rst.diff?tr1=1.2&tr2=1.3&r1=text&r2=text
Patches:
Index: fenfire/docs/pegboard/swamp_easier--benja/peg.rst
diff -u fenfire/docs/pegboard/swamp_easier--benja/peg.rst:1.2 fenfire/docs/pegboard/swamp_easier--benja/peg.rst:1.3
--- fenfire/docs/pegboard/swamp_easier--benja/peg.rst:1.2 Mon Sep 22 02:04:43 2003
+++ fenfire/docs/pegboard/swamp_easier--benja/peg.rst Sat Sep 27 13:02:57 2003
@@ -26,8 +26,54 @@
requested it.
-.. Issues
- ======
+Issues
+======
+
+- Should we keep the current methods, and just add those
+ proposed in this PEG? There is a lot of code using the
+ current methods; we could just deprecate them for now.
+
+ RESOLVED: No. The point is to *simplify* the API;
+ adding more variants doesn't do that.
+
+ Deprecating the current methods but not changing the code
+ that uses them adds to the confusion, rather than making
+ that code simpler.
+
+ (I have volunteered to change the existing code
+ if this PEG is accepted.)
+
+- What should happen in ``getObject()`` etc.
+ if there is more than one triple of the requested form?
+
+ RESOLVED: Do the same as currently: throw
+ ``NotUniqueException``. There are some problems
+ associated with that (see mailing list discussions),
+ but they are out of scope for this PEG.
+
+- What should be the name of the method returning
+ a ``TripleIter``? ``get()``, for symmetry with
+ the Collections API and the other functions;
+ ``find()``, similar to what we have now; or
+ ``query()`` for similarity with e.g. Aaron Swartz'
+ Python API for RDF?
+
+ RESOLVED: ``find()``. Tuomas explains:
+
+ I feel better about ``find()``, since it
+
+ 1. feels lighter than query
+ 2. feels heavier than get, as it should - we don't *necessarily*
+ have all indices ready.
+
+- Should you be able to query just subjects, i.e. ignoring objects,
+ having them ``null`` in ``TripleIter`` and not getting duplicates?
+
+ RESOLVED: No; this is what ``getSubjects()`` etc. are for.
+ Working with a ``Set`` is more useful and consistent in these cases
+ than working with a ``TripleIter`` (and having one of its elements
+ ``null``, i.e. not really iterating through *triples*, etc.).
+
A flavor of the API
===================
@@ -36,8 +82,8 @@
through a set of triples. I propose the following
interface::
- for(Triples t = graph.get(_, RDF.type, _); t.loop();) {
- System.out.println(t.sub+" is instance of "+t.ob);
+ for(TripleIter i = graph.find(_, RDF.type, _); i.loop();) {
+ System.out.println(i.subj+" is instance of "+i.obj);
}
I.e., have our own iterator-like thing, which iterates
@@ -59,9 +105,9 @@
when efficiency is at a premium. (Then again, when I print
to the console inside the loop, efficiency isn't at a
premium anyway... but whatever...) The *fast* version
-would look like this::
+would look like this [#speed]_::
- for(Triples t = graph.get_A1A(RDF.type); t.loop()) {
+ for(TripleIter t = graph.find_X1X(RDF.type); t.loop();) {
System.out.println(t.sub+" is instance of "+t.ob);
}
@@ -70,7 +116,7 @@
In Jython, the loop would look like this::
- t = graph.get(_, RDF.type, _)
+ t = graph.find(_, RDF.type, _)
while t.loop():
print "<%s> is instance of <%s>" % (t.sub, t.ob)
@@ -84,7 +130,7 @@
We'll make it a convention that classes using the API
have this at the top::
- static final _ = null;
+ static final Object _ = null;
You don't have to have this, but it makes things easier to read.
@@ -92,8 +138,8 @@
``ConstGraph``
--------------
-``ConstGraph`` shall have the following API
-for getting triples::
+The current methods for finding triples shall be removed
+from ``ConstGraph`` and be replaced by the following API::
/** Get an iterator through all triples in the graph
* matching a certain pattern.
@@ -102,11 +148,11 @@
* If any of the parameters is <code>null</code>,
* any node will match it.
*/
- Triples get(Object subject, Object predicate, Object object);
+ TripleIter find(Object subject, Object predicate, Object object);
// Versions that don't allow wildcards (``null``)
- Triples get_AA1(Object predicate, Object object);
- Triples get_1A1(Object subject, Object object);
+ TripleIter find_XX1(Object predicate, Object object);
+ TripleIter find_1X1(Object subject, Object object);
...
/** Get the subject of the triple matching a certain pattern.
@@ -116,10 +162,13 @@
* any node will match it.
* @returns The subject of the triple, if there is one,
* or <code>null</code> if there is no such triple.
+ * @throws NotUniqueException if there is more than one
+ * matching triple in the graph.
*/
- Object getSubject(Object subject, Object predicate, Object object);
+ Object getSubject(Object subject, Object predicate, Object object)
+ throws NotUniqueException;
- Object getSubject_A1A(Object predicate);
+ Object getSubject_X1X(Object predicate) throws NotUniqueException;
...
Note: The reason for having ``subject`` as a parameter
@@ -135,45 +184,57 @@
* If any of the parameters is <code>null</code>,
* any node will match it.
* <p>
- * The set is immutable; it is <em>not</em> backed
- * by the graph (i.e., changing the graph does not
- * change the set.)
+ * The set is backed by the graph (i.e., changing the graph
+ * changes the set, e.g. if the last triple with a given
+ * subject is removed from the graph, that subject
+ * disappears from the set). The set is <em>not</em> modifiable
+ * (e.g. the <code>add()</code> and <code>remove()</code> methods
+ * throw <code>UnsupportedOperationException</code>).
*/
Set getSubjects(Object subject, Object predicate, Object object);
-(Backing is harder to program and I don't see the pay-off,
-since the ``getXXXs`` functions won't be used that often.)
+Backing is generally used in the Collections API, and allows
+for lighter implementations of the method. For example,
+when using ``new TreeSet(graph.getSubjects(_, _, _))`` to get
+a *sorted* set of all subjects in a graph, it would be quite
+wasteful if ``getSubjects()`` created a ``HashSet`` only to have
+it discarded after being used in the constructor of ``TreeSet``.
- Set getSubjects_AA1(Object object);
+ Set getSubjects_XX1(Object object);
...
// getObject(), getObjects() similarly
- // getPredicate(), getPredicates() similarly
+ // getPredicates() similarly
-``getPredicate()`` is essentially useless, but we'll have it
-for symmetry. ``getPredicates()`` is useful, mostly for
+``getPredicate()`` is essentially useless, so we don't
+have it. This is symmetric with not having ``setPredicate()``,
+below. (If you need something to the same effect,
+you can use ``find()`` manually.)
+
+``getPredicates()`` is useful, mostly for
getting *all* predicates used in a graph.
-Note that we don't have ``X`` in the function variants
-any more, just ``1`` and ``A``, with ``A`` being equivalent
+Note that we don't have ``A`` in the function variants
+any more, just ``1`` and ``X``, with ``X`` being equivalent
to passing ``null`` in that position to the generic method.
-(E.g., ``getSubjects_AAA()`` is equivalent to
+(E.g., ``getSubjects_XXX()`` is equivalent to
``getSubjects(_, _, _)``, returning the set of all subjects
in the graph.)
-``Triples``
------------
+``TripleIter``
+--------------
-For the API of the iterator-like object, ``Triples``,
+For the API of the iterator-like object, ``TripleIter``,
see ``swamp_easier_iteration--benja``.
``Graph``
---------
-For changing graphs, the following API shall be used::
+The current methods for adding, changing and removing triples
+shall be removed from ``Graph`` and replaced by::
/** Add a triple to this graph. */
void add(Object subject, Object predicate, Object object);
@@ -186,8 +247,8 @@
*/
void remove(Object subject, Object predicate, Object object);
- void remove_A1A(Object predicate);
- void remove_1AA(Object subject);
+ void remove_X1X(Object predicate);
+ void remove_1XX(Object subject);
...
/** Replace all triples with the given predicate and object
@@ -219,6 +280,33 @@
I believe this API will be substantially simpler to use
than the one we have at the moment, and not lose
anything w.r.t. speed. In fact, it may speed things up
-in the future, because we can cache the ``Triples`` objects.
+in the future, because we can cache the ``TripleIter`` objects.
+
+\- Benja
+
-\- Benja
\ No newline at end of file
+.. [#speed] The speed difference between ``find(_, RDF.type, _)``
+ and ``find_X1X(RDF.type)`` is that ``find()`` has to check
+ for ``null`` in each of the arguments (that's three ``jnz``
+ instructions) and do one method call. (If we can get the compiler
+ to inline the ``find_XXX()`` variants, the method call goes away.)
+ This may actually be fine even in an inner loop. (The
+ hashtable lookups inside the loop will probably not be as cheap!)
+
+ One might think that all fields of ``TripleIter``
+ (``subj``, ``pred``, ``obj``) need to be fetched for each
+ iteration, but that's actually not true: Only those that are
+ different from the previous iteration need to be fetched.
+ (The implementation of the iterator can easily know
+ which those are.)
+
+ The only situation where this makes a speed difference
+ is something like::
+
+ for(TripleIter i = graph.find(_, RDF.type, _); i.loop();) {
+ System.out.println("Has an rdf:type: "+i.subj);
+ }
+
+ where fetching the ``obj`` each time is superfluous.
+ This situation is not expected to be frequent enough
+ to be a problem.
\ No newline at end of file
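
The following is a minimal sketch of how the ``find()`` / ``TripleIter`` / ``getSubject()`` pieces described in the patch could fit together. It is not the Fenfire implementation. Assumptions: the class names ``SketchGraph`` and ``SwampSketchDemo`` are hypothetical; triples are kept in a plain list with linear matching (no indices); ``NotUniqueException`` is modeled as an unchecked exception because the PEG does not say which kind it is; and the constant is called ``ANY`` instead of ``_``, because ``_`` is no longer a legal identifier in Java 9 and later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Stand-in for the PEG's NotUniqueException (unchecked here by assumption). */
class NotUniqueException extends RuntimeException {
    NotUniqueException(String message) { super(message); }
}

/** Iterator-like cursor: call loop() to advance, then read subj/pred/obj. */
class TripleIter {
    public Object subj, pred, obj;
    private final List<Object[]> matches;
    private int next = 0;

    TripleIter(List<Object[]> matches) { this.matches = matches; }

    /** Moves to the next triple; returns false once all triples are consumed. */
    public boolean loop() {
        if (next >= matches.size()) return false;
        Object[] t = matches.get(next++);
        subj = t[0];
        pred = t[1];
        obj = t[2];
        return true;
    }
}

/** Hypothetical in-memory graph; a linear scan stands in for real indices. */
class SketchGraph {
    static final Object ANY = null; // plays the role of `_` in the PEG
    private final List<Object[]> triples = new ArrayList<>();

    /** Adds a triple; a graph is a set, so duplicates are ignored. */
    public void add(Object subject, Object predicate, Object object) {
        Objects.requireNonNull(subject);
        Objects.requireNonNull(predicate);
        Objects.requireNonNull(object);
        if (find(subject, predicate, object).loop()) return;
        triples.add(new Object[] { subject, predicate, object });
    }

    private static boolean matches(Object pattern, Object node) {
        return pattern == null || pattern.equals(node); // null is the wildcard
    }

    /** Generic variant: any parameter may be null to match any node. */
    public TripleIter find(Object subject, Object predicate, Object object) {
        List<Object[]> out = new ArrayList<>();
        for (Object[] t : triples) {
            if (matches(subject, t[0]) && matches(predicate, t[1])
                    && matches(object, t[2])) {
                out.add(t);
            }
        }
        return new TripleIter(out);
    }

    /** Fixed-pattern variant; here it simply delegates to find(). */
    public TripleIter find_X1X(Object predicate) {
        return find(ANY, predicate, ANY);
    }

    /** Returns the single matching subject, or null if there is no match. */
    public Object getSubject(Object subject, Object predicate, Object object) {
        TripleIter i = find(subject, predicate, object);
        if (!i.loop()) return null;
        Object result = i.subj;
        if (i.loop()) {
            throw new NotUniqueException("more than one matching triple");
        }
        return result;
    }
}

class SwampSketchDemo {
    public static void main(String[] args) {
        SketchGraph graph = new SketchGraph();
        graph.add("doc1", "rdf:type", "Document");
        graph.add("doc2", "rdf:type", "Document");
        for (TripleIter i = graph.find_X1X("rdf:type"); i.loop();) {
            System.out.println(i.subj + " is instance of " + i.obj);
        }
    }
}
```

In this sketch ``find_X1X()`` gains nothing over ``find()``, since both do a linear scan; a real implementation would presumably go straight to a predicate index in the fixed-pattern variant.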