Re: [Fenfire-dev] PEG swamp_easier--benja: An easier API for Swamp
From: Tuomas Lukka
Subject: Re: [Fenfire-dev] PEG swamp_easier--benja: An easier API for Swamp
Date: Mon, 22 Sep 2003 13:10:38 +0300
User-agent: Mutt/1.5.4i

On Mon, Sep 22, 2003 at 12:54:32PM +0300, Benja Fallenstein wrote:
> Tuomas Lukka wrote:
> >On Mon, Sep 22, 2003 at 05:09:44AM +0300, Benja Fallenstein wrote:
> >> for(Triples t = graph.get(_, RDF.type, _); t.loop();) {
> >> System.out.println(t.sub+" is instance of "+t.ob);
> >> }
> >
> >This is nice.
> >
> >ISSUE: Name for that call: get(...)? We have find() so far.
>
> Hm. I've always advocated get ;-) ;-)
>
> I've done some googling-- e.g. Aaron Swartz' Python API uses query(...)
> (with similar semantics). The thing I don't like about find() and
> query() is mostly psychological: they seem to indicate a little effort,
> whereas get(...) sounds like something that's essentially free. But
> that's only a mild objection to find(), not a strong one.
>
> What do you think?
I feel better about find(), since it
1) feels lighter than query
2) feels heavier than get, as it should - we don't *necessarily*
   have all indices ready.
And it's consistent with the code that is there already. If there's
a change, change all the occurrences.
> >ISSUE: Name for the iterator-like thing that goes through triples.
> >"Triples" says it contains several triples while it has only one
> >at a time. "TripleIterator", "TripleIter", ...?
>
> I wanted it to be short, of course, but I guess you have a point.
> ``TripleIter`` should be fine... ::
>
> for(TripleIter i = graph.get(_, RDF.type, _); i.loop();) {
> System.out.println(i.subj+" instance of "+i.obj);
> }
>
> I still prefer ``Triples``, but I'm willing to settle for ``TripleIter``.
I'd prefer Iter, as it says what it is.
> >>However, to be fair, my code isn't how it would look
> >>when efficiency is at a premium. (Then again, when I print
> >>to the console inside the loop, efficiency isn't at a
> >>premium anyway... but whatever...) The *fast* version
> >>would look like this::
> >
> >Umm, you should note here that the efficiency difference is in the call,
> >not in the actual code, as get() can be just a set of if clauses
>
> True.
>
> >and actually I think that hotspot might be able to handle it.
>
> I earlier suggested that and you were suspicious of it ;-) I do agree--
> it's essentially three ``jnz``s per ``get()``, very cheap. I can say
> this in the PEG.
Three jnz's and a method call.
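
To make the cost argument concrete, here is a minimal sketch of a generic find() that is just a few null checks in front of a specialized call. Everything in it is an assumption made for illustration: the class DispatchSketchGraph, the ANY constant standing in for ``_``, and the linear-scan storage are hypothetical, not the real Swamp code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a Swamp graph; none of this is the real API. */
class DispatchSketchGraph {
    /** Wildcard marker, standing in for the `_` used in the examples. */
    static final Object ANY = null;

    private final List<Object[]> triples = new ArrayList<>();

    void add(Object subj, Object pred, Object obj) {
        triples.add(new Object[] { subj, pred, obj });
    }

    /** Generic entry point: a few null checks, then one more method call. */
    List<Object[]> find(Object subj, Object pred, Object obj) {
        if (subj == ANY && pred != ANY && obj == ANY) {
            return find_X1X(pred);
        }
        // Every other pattern falls back to a linear scan in this sketch.
        return scan(subj, pred, obj);
    }

    /** Specialized call: only the predicate is fixed. */
    List<Object[]> find_X1X(Object pred) {
        return scan(ANY, pred, ANY);
    }

    private List<Object[]> scan(Object subj, Object pred, Object obj) {
        List<Object[]> out = new ArrayList<>();
        for (Object[] t : triples) {
            if ((subj == ANY || t[0].equals(subj))
                    && (pred == ANY || t[1].equals(pred))
                    && (obj == ANY || t[2].equals(obj))) {
                out.add(t);
            }
        }
        return out;
    }
}
```

The extra work in find() compared to calling find_X1X() directly is the three comparisons plus the call itself, which is the overhead being discussed here.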
> >However, there's another performance difference with the Triples objects
> >which you haven't mentioned: *all* members need to be fetched each
> >time.
>
> Not exactly true: Only the members which change need to be. E.g., if you
> have ::
>
> get(_, RDF.type, _)
>
> only the subject and object need to be loaded each time.
>
> And most of the time if you do such a query you would want to use both
> of them. So it would only cost extra if you do such a query, but do
> *not* use both subject and object.
Issue: Should you be able to query just subjects, i.e. ignoring objects,
having them null in the triples and not getting duplicates?
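
To illustrate what such a subjects-only query would mean, here is a small sketch. The class SubjectQuerySketch and the method subjectsWith() are hypothetical names, and triples are represented as plain Object[3] arrays purely for the example.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; not part of Swamp. */
class SubjectQuerySketch {
    /**
     * Returns each subject that has at least one triple with the given
     * predicate. Objects are ignored, so a subject with several matching
     * triples is reported only once.
     */
    static Set<Object> subjectsWith(List<Object[]> triples, Object pred) {
        Set<Object> subjects = new LinkedHashSet<>();
        for (Object[] t : triples) {
            if (t[1].equals(pred)) {
                subjects.add(t[0]);
            }
        }
        return subjects;
    }
}
```

With a plain (_, pred, _) query the same subject would show up once per matching object instead.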
> Still, can note it in the PEG. -- Or maybe we *should* have::
>
> Object subj(), pred(), obj();
>
> These are also nicer because they can give error messages when
> ``next()`` hasn't been called yet. Opinions?
Hmm, could you test what the performance is then? How well
is hotspot able to optimize that if?
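
For reference, the kind of accessor being asked about might look like the sketch below. The class CheckedTripleIter and its array-backed storage are hypothetical, and it assumes that next() has to be called once before the first triple can be read. The "if" in question is the state check in current().

```java
import java.util.NoSuchElementException;

/** Hypothetical iterator over triples; names are illustrative only. */
class CheckedTripleIter {
    private final Object[][] rows;
    private int pos = -1; // -1 means next() has not been called yet

    CheckedTripleIter(Object[][] rows) {
        this.rows = rows;
    }

    boolean hasNext() {
        return pos + 1 < rows.length;
    }

    void next() {
        if (!hasNext()) throw new NoSuchElementException();
        pos++;
    }

    Object subj() { return current()[0]; }
    Object pred() { return current()[1]; }
    Object obj()  { return current()[2]; }

    /** The check that public fields cannot perform. */
    private Object[] current() {
        if (pos < 0) {
            throw new IllegalStateException("next() has not been called yet");
        }
        return rows[pos];
    }
}
```

Whether the check costs anything measurable after inlining is exactly what a benchmark would have to show; the sketch makes no claim about that.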
> >> for(Triples t = graph.get_A1A(RDF.type); t.loop()) {
> >> System.out.println(t.sub+" is instance of "+t.ob);
> >> }
> >
> >Note: missing a semicolon.
> >
> >ISSUE: Naming. I'd think find_X1X_Triples would make more sense here.
>
> find...Triples: Any particular reason?
We have find_..._Iter, so it would be easiest to put the return type
in the name; once we have used swamp for several years and *know* the
best solution, we'll take that as the return type.
The point is that you can't overload just by return type.
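
A short sketch of the naming constraint: Java does not allow two methods that differ only in return type, so the return type has to go into the method name. The interface SketchFinder, the class ListFinder, and the use of List<Object[]> as a stand-in for a Triples-like result are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical interface; the method names follow the find_..._Iter scheme. */
interface SketchFinder {
    Iterator<Object[]> find_X1X_Iter(Object pred);

    // Declaring a second find_X1X_Iter(Object) with a different return type
    // would be a compile error, so the other variant gets its own name.
    List<Object[]> find_X1X_Triples(Object pred);
}

/** Trivial list-backed implementation, only here to make the sketch runnable. */
class ListFinder implements SketchFinder {
    private final List<Object[]> triples;

    ListFinder(List<Object[]> triples) {
        this.triples = triples;
    }

    public List<Object[]> find_X1X_Triples(Object pred) {
        List<Object[]> out = new ArrayList<>();
        for (Object[] t : triples) {
            if (t[1].equals(pred)) out.add(t);
        }
        return out;
    }

    public Iterator<Object[]> find_X1X_Iter(Object pred) {
        return find_X1X_Triples(pred).iterator();
    }
}
```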
> >> Object getSubject(Object subject, Object predicate, Object object);
> >>
> >> Object getSubject_A1A(Object predicate);
> >> ...
> >
> >ISSUE: If there is more than one?
>
> Clarified on IRC: The issue is what happens if there is more than one
> matching triple.
>
> The current way is to throw NotUniqueException.
The javadoc didn't say that.
> There's a problem with that: Basically always when a property has
> cardinality one, there can still be two nodes in the graph, e.g.::
>
> x:foo ex:homeCountry y:bar .
> x:foo ex:homeCountry z:baz .
>
> because ``y:bar`` and ``z:baz`` may represent the same resource. (You
> cannot require global agreement on the one URI to be used for every
> particular thing in the world.)
>
> So signalling an error isn't necessarily correct.
>
> Jena returns just an arbitrary one of the matching triples in a similar
> situation; I'm leaning towards that.
I'd *really* hate that one -- I'd prefer swamp to have totally clear
semantics, with the only arbitrary thing being the order in which a set
is iterated through.
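
As a sketch of the strict behaviour argued for here: return the single match, return null when there is none, and throw when there is more than one. NotUniqueException is named in the thread, but its definition is not shown, so the unchecked version below is an assumption, as are the class UniqueLookupSketch and the Object[3] triple representation.

```java
import java.util.List;

/** Assumed shape of the exception; the real class may differ. */
class NotUniqueException extends RuntimeException {
    NotUniqueException(String message) {
        super(message);
    }
}

/** Hypothetical helper; not part of Swamp. */
class UniqueLookupSketch {
    /**
     * Returns the object of the only triple (subj, pred, *), or null if
     * there is none. More than one match is an error, never a silent pick.
     */
    static Object getObject(List<Object[]> triples, Object subj, Object pred) {
        Object found = null;
        boolean seen = false;
        for (Object[] t : triples) {
            if (!t[0].equals(subj) || !t[1].equals(pred)) continue;
            if (seen) {
                throw new NotUniqueException(
                        "more than one match for " + subj + " " + pred);
            }
            found = t[2];
            seen = true;
        }
        return found;
    }
}
```

With the two ex:homeCountry triples from the example above this throws, which is deterministic, whereas returning an arbitrary match would depend on iteration order.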
> >>The iterator-like object, ``Triples``, shall have
> >>the following API::
> >>
> >> Object sub, pred, ob;
> >
> >Issue: Names. subj, pred, obj would be more consistent, i.e.
> >up to the *end* of the second consonant group.
>
> Yes, but these are also impossible to pronounce... "SUB-djjjj"
>
> Sub, pred, ob are the shortest abbreviations that have a chance to get
> understood, so they're consistent in a sense, too. ;-) I.e., "su" or
> "pre" would be misleading/not understood.
"s", "p", "o"?
I've seen subj used elsewhere as an abbrev. for subject, but never sub - it's a
prefix, as is ob.
> >>The purpose of ``loop()`` is to enable the common loop
> >>pattern, ::
> >>
> >> for(Triples t = graph.get(...); t.loop();) {
> >> // ...
> >> }
> >>
> >>which would otherwise have to be written as::
> >>
> >> Triples t;
> >> for(t = graph.get(...); t.hasNext(); t.next()) {
> >> // ...
> >> }
> >> t.free();
> >
> >This should go into the javadoc.
>
> Sure, but for the PEG I found it easier to read in the body, and the
> javadoc is in the PEG for clarification of the PEG, no?
Ok.
> The examples should go into the *class*'s javadoc actually, I think.
Exactly what I meant.
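
For the class javadoc, the loop() idiom could be illustrated with something like the following sketch. LoopSketchIter is a hypothetical class with array-backed storage; it assumes that loop() advances to the next triple when there is one and otherwise calls free() and returns false, which is what lets the for loop omit the explicit free().

```java
/** Hypothetical iterator showing one possible loop() contract. */
class LoopSketchIter {
    private final Object[][] rows;
    private int pos = -1;

    /** Current triple; only valid after loop() has returned true. */
    Object subj, pred, obj;

    /** Set by free(); exposed here only so the sketch can be checked. */
    boolean freed = false;

    LoopSketchIter(Object[][] rows) {
        this.rows = rows;
    }

    /** Advances if possible; otherwise releases the iterator. */
    boolean loop() {
        if (pos + 1 < rows.length) {
            pos++;
            subj = rows[pos][0];
            pred = rows[pos][1];
            obj = rows[pos][2];
            return true;
        }
        free();
        return false;
    }

    /** Stand-in for returning the object to a pool. */
    void free() {
        freed = true;
        subj = pred = obj = null;
    }
}
```

Usage then has the same shape as the short loop in the PEG: for(LoopSketchIter i = ...; i.loop();) { ... }, with the iterator freed as a side effect of the final loop() call.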
Tuomas