RE: FW: [DotGNU]System.Xml
From: Simon Guindon
Subject: RE: FW: [DotGNU]System.Xml
Date: Wed, 8 Jan 2003 15:09:34 -0500
Well, that's a good quick run-through of XML basics, hehe, but my problem is
when a node has inner text, i.e. not XML but TEXT. For example:
<node>Test</node>
Test is being added as a child node of node.
I'm told this is how it's supposed to work, but I know for a fact it's
incorrect to be able to access text nodes in the node.ChildNodes collection.
So I have no idea what's correct; that's why I'm asking, but I know it
shouldn't be accessible there.
Is text supposed to be a child node of a node? If so, how do we make it
invisible to the ChildNodes collection? If not, do we just make InnerText a
simple string accessor?
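For reference, here is a minimal sketch of how one W3C DOM implementation handles this exact case. It uses Python's standard xml.dom.minidom, which is an assumption made for illustration: it is not DotGNU's or Microsoft's System.Xml, so it only shows what the DOM model itself does. The helper name inner_text is hypothetical; it shows how a string accessor can sit on top of the child text nodes rather than replace them.

```python
from xml.dom import minidom

doc = minidom.parseString("<node>Test</node>")
node = doc.documentElement

# In the W3C DOM model the character data is held in a child Text node.
children = node.childNodes
first = children[0]


def inner_text(element):
    """Hypothetical helper: concatenate all descendant text of an element."""
    parts = []
    for child in element.childNodes:
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(inner_text(child))
    return "".join(parts)


print(len(children), first.nodeType == first.TEXT_NODE, inner_text(node))
```

In this implementation, at least, the text is both a child node and reachable as a plain string, so the two options in the question are not mutually exclusive.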
Thanks,
Simon
-----------------------------
Simon Guindon
Nureality Networks
www.nureality.ca
-----Original Message-----
From: Chris Smith [mailto:address@hidden]
Sent: January 8, 2003 1:46 PM
To: address@hidden
Subject: Re: FW: [DotGNU]System.Xml
On Wednesday 08 January 2003 18:02, you wrote:
> Ok we got the basic DOM structure created now, I have a slight problem
> though, TEXT nodes are being added to a nodes .ChildNodes array when you
> do a .InnerText = somestring
>
> Is this the desired result? I know from extensive use of the .NET
> System.Xml that text does not show up as nodes in the .ChildNodes
> collection. Currently our nodes hold text, and link to each other in the
> document, all seems to be well, except that when you add body to a tag, IE
> <node>Test</node> Test ends up being a child node of node. I know people
> keep saying don't use .NET as a reference, but I truly believe something
> is wrong here.
FWIW, my experience of XML parsing is generally that an entity exists as a
child node of a parent, and may have associated data. This child node may
be a parent itself. Attributes are the same but cannot be parents themselves.
E.g.:
<doc>
  <node_a>
    Some text with node A
    <node_b attr="Some attribute text">
      Node B has some too
      <node_c>This is a leaf</node_c>
    </node_b>
  </node_a>
</doc>
As a tree becomes:
+ doc                                      Branch Node
  + node_a - [Some text with node A]       Branch Node
    + node_b - [Node B has some too]       Branch Node
      + attr - [Some attribute text]       Leaf Node
      + node_c - [This is a leaf]          Leaf Node
So any node can have data associated with it. However, only entity nodes
can have data and children. Attribute nodes may only have data (i.e. they
are always leaf nodes).
Now actually representing this so that it is parsable can be a pain, and you
sometimes need to introduce a 'false' value node so that only LEAF nodes
contain data; branch nodes are just branch nodes.
+ node_a                                   Branch
  + value - Some text with node A          Leaf
  + node_b                                 Branch
    + value - Node B has some too          Leaf
    + node_c                               Branch
      +
etc. But this is just a design thing, making tree building and parsing
easier (most of the time) by enforcing the simple rule of 'Only a leaf may
have data'.
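A rough sketch of that 'only a leaf may have data' layout, using the example document from earlier in this message. It is written against Python's xml.dom.minidom purely for illustration; the function names to_tree and dump, the tuple layout, and the choice to list attributes ahead of the 'value' leaf are all assumptions, not anything taken from the DotGNU code.

```python
from xml.dom import minidom

XML = (
    '<doc><node_a>Some text with node A'
    '<node_b attr="Some attribute text">Node B has some too'
    '<node_c>This is a leaf</node_c></node_b></node_a></doc>'
)


def to_tree(element):
    """Return (name, children) for a branch; leaves are (name, data) pairs."""
    children = []
    # Attributes become leaves named after the attribute (assumed ordering).
    for name, value in sorted(element.attributes.items()):
        children.append((name, value))
    for child in element.childNodes:
        if child.nodeType == child.TEXT_NODE:
            text = child.data.strip()
            if text:
                # Text is pushed down into a 'false' value leaf.
                children.append(("value", text))
        elif child.nodeType == child.ELEMENT_NODE:
            children.append(to_tree(child))
    return (element.tagName, children)


def dump(node, depth=0):
    """Render the tree as indented '+ name' lines, one per node."""
    name, payload = node
    pad = "  " * depth
    if isinstance(payload, list):
        lines = [f"{pad}+ {name}  Branch"]
        for child in payload:
            lines.extend(dump(child, depth + 1))
        return lines
    return [f"{pad}+ {name} - {payload}  Leaf"]


tree = to_tree(minidom.parseString(XML).documentElement)
print("\n".join(dump(tree)))
```

Here a branch carries only a list of children and a leaf carries only a string, so a walker never has to ask whether a node holds both.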
And now I've forgotten what your original question was :o)
<re-reads....> It sounds like you've ended up with the latter case, which
doesn't surprise me; it's the more natural representation. TBH it doesn't
matter how it does it on the inside, so long as it is compliant on the
outside. However, XML parsing is the thorn in the side of many systems with
regard to poor performance, so if it were actually quite efficient
internally, then cool!
Don't know if any of that helped....
Cheers
Chris
--
Chris Smith
Technical Architect - netFluid Technology Ltd.
"Internet Technologies, Distributed Systems and Tuxedo Consultancy"
E: address@hidden W: http://www.nfluid.co.uk