Re: [Fenfire-dev] PEG swamp_easier--benja: An easier API for Swamp
From: Tuomas Lukka
Subject: Re: [Fenfire-dev] PEG swamp_easier--benja: An easier API for Swamp
Date: Mon, 22 Sep 2003 11:22:23 +0300
User-agent: Mutt/1.5.4i
I like the idea. Comments below.
Tuomas
On Mon, Sep 22, 2003 at 05:09:44AM +0300, Benja Fallenstein wrote:
>
> .. Issues
> ======
>
> A flavor of the API
> ===================
>
> First of all, we need a good way for iterating
> through a set of triples. I propose the following
> interface::
>
>     for(Triples t = graph.get(_, RDF.type, _); t.loop();) {
>         System.out.println(t.sub+" is instance of "+t.ob);
>     }
This is nice.
ISSUE: Name for that call: get(...)? We have find() so far.
> I.e., have our own iterator-like thing, which iterates
> through a set of *triples*-- rather than nodes-- but doesn't
> need to create objects for every one of these triples.
*VERY* nice.
However,
ISSUE: Name for the iterator-like thing that goes through triples.
"Triples" says it contains several triples while it has only one
at a time. "TripleIterator", "TripleIter", ...?
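To make the no-allocation point concrete, here is a minimal sketch of such a cursor. Everything in it is an assumption for illustration: the names SketchGraph and TripleCursor, the parallel-list storage and the linear scan are hypothetical, not Swamp's implementation. Only loop(), the public sub/pred/ob fields and the null-as-wildcard rule are taken from the proposal.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory graph; the three lists hold one triple per index. */
class SketchGraph {
    private final List<Object> subs = new ArrayList<>();
    private final List<Object> preds = new ArrayList<>();
    private final List<Object> obs = new ArrayList<>();

    void add(Object sub, Object pred, Object ob) {
        subs.add(sub);
        preds.add(pred);
        obs.add(ob);
    }

    /** A null argument matches any node, as in the proposal. */
    TripleCursor get(Object sub, Object pred, Object ob) {
        return new TripleCursor(sub, pred, ob);
    }

    private static boolean matches(Object pattern, Object node) {
        return pattern == null || pattern.equals(node);
    }

    /** A single object that is repositioned on each matching triple in turn. */
    class TripleCursor {
        Object sub, pred, ob;                 // the current triple
        private final Object wantSub, wantPred, wantOb;
        private int index = -1;               // index of the current triple

        TripleCursor(Object wantSub, Object wantPred, Object wantOb) {
            this.wantSub = wantSub;
            this.wantPred = wantPred;
            this.wantOb = wantOb;
        }

        /** Move to the next match; returns false once there are no more. */
        boolean loop() {
            for (int i = index + 1; i < subs.size(); i++) {
                if (matches(wantSub, subs.get(i))
                        && matches(wantPred, preds.get(i))
                        && matches(wantOb, obs.get(i))) {
                    index = i;
                    sub = subs.get(i);
                    pred = preds.get(i);
                    ob = obs.get(i);
                    return true;
                }
            }
            index = subs.size();
            return false;
        }
    }
}
```

One cursor is allocated per get() call, however many triples match. Each step copies all three fields, whether or not the loop body reads them.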
> For good measure, here's how the above code would look
> in the current API::
>
>     for(Iterator i=graph.findN_X1A(RDF.type); i.hasNext();) {
>         Object sub = i.next();
>         for(Iterator j=graph.findN_11X(sub, RDF.type); j.hasNext();) {
>             Object ob = j.next();
>             System.out.println(sub+" is instance of "+ob);
>         }
>     }
>
> However, to be fair, my code isn't how it would look
> when efficiency is at a premium. (Then again, when I print
> to the console inside the loop, efficiency isn't at a
> premium anyway... but whatever...) The *fast* version
> would look like this::
Umm, you should note here that the efficiency difference is in the call,
not in the actual code: get() can be just a set of if clauses,
and I actually think HotSpot might be able to handle it.
However, there's another performance difference with the Triples objects
which you haven't mentioned: *all* members need to be fetched each
time.
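A sketch of what "just a set of if clauses" could look like. The class DispatchSketch and the constant ANY are made up for illustration, and instead of calling real lookups the method returns the name of the specialised variant it would delegate to, so that the dispatch itself can be checked. The variant names follow the get_A1A pattern used in the PEG. ANY stands in for "_", which current Java versions no longer accept as an identifier.

```java
/** Hypothetical dispatcher: the generic call selects a variant by testing for null. */
class DispatchSketch {
    /** Stand-in for the PEG's "_" wildcard constant. */
    static final Object ANY = null;

    /** Returns the name of the specialised variant a generic get() would call. */
    static String variantFor(Object sub, Object pred, Object ob) {
        if (sub == null) {
            if (pred == null) {
                return ob == null ? "get_AAA" : "get_AA1";
            }
            return ob == null ? "get_A1A" : "get_A11";
        }
        if (pred == null) {
            return ob == null ? "get_1AA" : "get_1A1";
        }
        return ob == null ? "get_11A" : "get_111";
    }
}
```

The generic entry point costs three null checks per call before it reaches the specialised code.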
>     for(Triples t = graph.get_A1A(RDF.type); t.loop()) {
>         System.out.println(t.sub+" is instance of "+t.ob);
>     }
Note: missing a semicolon.
ISSUE: Naming. I'd think find_X1X_Triples would make more sense here.
> Changes
> =======
>
> We'll make it a convention that classes using the API
> have this at the top::
>
> static final _ = null;
static final **Object** _ = null; ?
> You don't have to have this, but it makes things easier to read.
> ``ConstGraph``
> --------------
>
> ``ConstGraph`` shall have the following API
> for getting triples::
>
> /** Get an iterator through all triples in the graph
> * matching a certain pattern.
> * If <code>subject</code>, <code>predicate</code> and/or
> * <code>object</code> are given, the triples must match these.
> * If any of the parameters is <code>null</code>,
> * any node will match it.
> */
> Triples get(Object subject, Object predicate, Object object);
>
> // Versions that don't allow wildcards (``null``)
> Triples get_A11(Object predicate, Object object);
> Triples get_1A1(Object subject, Object object);
> ...
>
> /** Get the subject of the triple matching a certain pattern.
> * If <code>subject</code>, <code>predicate</code> and/or
> * <code>object</code> are given, the triple must match these.
> * If any of the parameters is <code>null</code>,
> * any node will match it.
> * @return The subject of the triple, if there is one,
> * or <code>null</code> if there is no such triple.
> */
> Object getSubject(Object subject, Object predicate, Object object);
>
> Object getSubject_A1A(Object predicate);
> ...
ISSUE: If there is more than one?
> Note: The reason for having ``subject`` as a parameter
> for ``getSubject()`` is that it's easier to read. It will
> almost always be "``_``" (i.e., ``null``). It shall work
> consistently, though: If a subject is given, and there is
> such a triple in the graph, return that subject; otherwise,
> return ``null``.
>
> /** Get the subjects of all triples matching a certain pattern.
> * If <code>subject</code>, <code>predicate</code> and/or
> * <code>object</code> are given, the triple must match these.
> * If any of the parameters is <code>null</code>,
> * any node will match it.
> * <p>
> * The set is immutable; it is <em>not</em> backed
> * by the graph (i.e., changing the graph does not
> * change the set.)
> */
> Set getSubjects(Object subject, Object predicate, Object object);
>
> (Backing is harder to program and I don't see the pay-off,
> since the ``getXXXs`` functions won't be used that often.)
>
> Set getSubjects_AA1(Object object);
> ...
>
> // getObject(), getObjects() similarly
> // getPredicate(), getPredicates() similarly
>
> ``getPredicate()`` is essentially useless, but we'll have it
> for symmetry. ``getPredicates()`` is useful, mostly for
> getting *all* predicates used in a graph.
>
> Note that we don't have ``X`` in the function variants
> any more, just ``1`` and ``A``, with ``A`` being equivalent
> to passing ``null`` in that position to the generic method.
>
> (E.g., ``getSubjects_AAA()`` is equivalent to
> ``getSubjects(_, _, _)``, returning the set of all subjects
> in the graph.)
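The "not backed by the graph" behaviour of getSubjects() can be sketched as follows. SnapshotGraph and its list-of-arrays storage are assumptions for illustration only. The point is that the result is copied out and wrapped, so it neither tracks later changes to the graph nor accepts modification.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical graph that stores each triple as {subject, predicate, object}. */
class SnapshotGraph {
    private final List<Object[]> triples = new ArrayList<>();

    void add(Object sub, Object pred, Object ob) {
        triples.add(new Object[] {sub, pred, ob});
    }

    /** Subjects of all matching triples; null in any position is a wildcard. */
    Set<Object> getSubjects(Object sub, Object pred, Object ob) {
        Set<Object> result = new LinkedHashSet<>();   // fresh copy on every call
        for (Object[] t : triples) {
            if (matches(sub, t[0]) && matches(pred, t[1]) && matches(ob, t[2])) {
                result.add(t[0]);
            }
        }
        // Read-only view of a private copy: immutable from the caller's side.
        return Collections.unmodifiableSet(result);
    }

    private static boolean matches(Object pattern, Object node) {
        return pattern == null || pattern.equals(node);
    }
}
```

A set obtained before a later add() keeps its old contents, and calling add() on the set itself throws UnsupportedOperationException.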
>
>
> ``Triples``
> -----------
>
> The iterator-like object, ``Triples``, shall have
> the following API::
>
> Object sub, pred, ob;
ISSUE: Names. subj, pred, obj would be more consistent, i.e.
up to the *end* of the second consonant group.
> (These are ``null`` when the object hasn't been
> initialized, i.e., ``next()`` hasn't been called yet.)
>
> /** Advance to the next triple. */
> void next();
>
> /** Whether there are any more triples to iterate through. */
> boolean hasNext();
>
> /** Indicate that this <code>Triples</code> object won't be
> * used any more.
> * This shall only be called by the code that has requested
> * this object from <code>ConstGraph</code> (through
> * <code>.get()</code>). Its purpose is to tell the
> * <code>ConstGraph</code> that it can be re-used for the
> * next <code>get()</code>; <code>ConstGraph</code> can then
> * cache <code>Triples</code> objects, making life easier
> * for the garbage collector.
> * <p>
> * Calling this method is not obligatory. (If you don't,
> * this object will be garbage-collected normally.)
> */
> void free();
>
>     boolean loop() {
>         if(hasNext()) {
>             next();
>             return true;
>         } else {
>             free();
>             return false;
>         }
>     }
>
> The purpose of ``loop()`` is to enable the common loop
> pattern, ::
>
>     for(Triples t = graph.get(...); t.loop();) {
>         // ...
>     }
>
> which would otherwise have to be written as::
>
>     Triples t;
>     for(t = graph.get(...); t.hasNext(); t.next()) {
>         // ...
>     }
>     t.free();
This should go into the javadoc.
> This isn't just harder to read, it also scopes ``t``
> wrongly. With the ``loop()`` pattern, the scope of ``t``
> is the body of the loop, which is exactly the code
> executed before ``free()`` is called.
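Putting loop() and free() together, here is a minimal sketch of the recycling idea. PooledCursor, its static pool and the array it walks are all hypothetical. In the proposal the Triples object would be handed out and cached by ConstGraph instead. The sketch is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical cursor over an array of subjects, recycled through a pool. */
class PooledCursor {
    private static final Deque<PooledCursor> POOL = new ArrayDeque<>();

    Object sub;               // current element; null before the first next()
    private Object[] items;
    private int pos;

    /** Hand out a recycled cursor if one is available, otherwise allocate. */
    static PooledCursor over(Object[] items) {
        PooledCursor c = POOL.isEmpty() ? new PooledCursor() : POOL.pop();
        c.items = items;
        c.pos = 0;
        c.sub = null;
        return c;
    }

    boolean hasNext() {
        return pos < items.length;
    }

    void next() {
        sub = items[pos++];
    }

    /** Return this cursor to the pool; the caller must not use it afterwards. */
    void free() {
        items = null;
        sub = null;
        POOL.push(this);
    }

    /** Same shape as loop() in the PEG: advance, or free and stop. */
    boolean loop() {
        if (hasNext()) {
            next();
            return true;
        }
        free();
        return false;
    }

    /** Number of cursors currently waiting for reuse. */
    static int pooled() {
        return POOL.size();
    }
}
```

If the loop body leaves early with break or return, free() is never reached and the cursor is simply garbage-collected, which the PEG explicitly allows.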
>
>
> ``Graph``
> ---------
>
> For changing graphs, the following API shall be used::
>
> /** Add a triple to this graph. */
> void add(Object subject, Object predicate, Object object);
>
> /** Remove all triples matching a certain pattern from this graph.
> * If <code>subject</code>, <code>predicate</code> and/or
> * <code>object</code> are given, the triple must match these.
> * If any of the parameters is <code>null</code>,
> * any node will match it.
> */
> void remove(Object subject, Object predicate, Object object);
>
> void remove_A1A(Object predicate);
> void remove_1AA(Object subject);
> ...
>
> /** Replace all triples with the given predicate and object
> * with the given triple.
> */
> void setSubject(Object subject, Object predicate, Object object);
>
> /** Replace all triples with the given subject and predicate
> * with the given triple.
> */
> void setObject(Object subject, Object predicate, Object object);
>
> We don't have ``setPredicate()`` because it is essentially useless
> and potentially harmful-- someone using it almost certainly
> intended to do something else.
You're not marking exactly what the **diff** to current practice
is here, and why.
> This is never a problem because the ``setXXX()`` methods
> are only a convenience. You can always do::
>
> graph.remove(_, predicate, _);
> graph.add(subject, predicate, object);
>
> if you *do* happen to have some esoteric use for it.
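The javadoc for setSubject() and setObject() reads as remove-then-add, which can be sketched like this. MutableSketchGraph, its list storage and the contains() helper are assumptions for illustration. Only the add/remove/setSubject/setObject signatures and the null-as-wildcard rule come from the PEG.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical mutable graph that stores triples as {subject, predicate, object}. */
class MutableSketchGraph {
    private final List<Object[]> triples = new ArrayList<>();

    /** Add a triple; a graph is a set, so duplicates are ignored. */
    void add(Object sub, Object pred, Object ob) {
        if (!contains(sub, pred, ob)) {
            triples.add(new Object[] {sub, pred, ob});
        }
    }

    /** Remove all triples matching the pattern; null matches any node. */
    void remove(Object sub, Object pred, Object ob) {
        triples.removeIf(t ->
            matches(sub, t[0]) && matches(pred, t[1]) && matches(ob, t[2]));
    }

    /** Replace all triples with this predicate and object by the given triple. */
    void setSubject(Object sub, Object pred, Object ob) {
        remove(null, pred, ob);
        add(sub, pred, ob);
    }

    /** Replace all triples with this subject and predicate by the given triple. */
    void setObject(Object sub, Object pred, Object ob) {
        remove(sub, pred, null);
        add(sub, pred, ob);
    }

    boolean contains(Object sub, Object pred, Object ob) {
        for (Object[] t : triples) {
            if (t[0].equals(sub) && t[1].equals(pred) && t[2].equals(ob)) {
                return true;
            }
        }
        return false;
    }

    int size() {
        return triples.size();
    }

    private static boolean matches(Object pattern, Object node) {
        return pattern == null || pattern.equals(node);
    }
}
```

Written this way, the set methods add nothing that a caller could not do with one remove() and one add().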
>
>
> Conclusion
> ==========
>
> I believe this API will be substantially simpler to use
> than the one we have at the moment, and not lose
> anything w.r.t. speed. In fact, it may speed things up
> in the future, because we can cache the ``Triples`` objects.
Tuomas