Velocity Reviews


Manuel Yguel 02-18-2004 09:10 AM

Is it possible with xerces ?
 
I am trying to parse an indented XML file with the Xerces-C++ DOM parser.
The file looks like this:
<root>
<child1>
<field1> foo </field1>
<field2> bar </field2>
</child1>
<child2>
<field1> foo </field1>
<field2> bar </field2>
</child2>
</root>

where line breaks and whitespace are present in the XML file. The program I
wrote with DOM gives me this tree:
root has five children:
text-node child1 text-node child2 text-node

the text of the first text-node is "\n "
the text of the second text-node is "\n "
the text of the third text-node is "\n"

These whitespace-only text nodes occur at every level of the tree hierarchy.

Is it possible to strip these nodes automatically?

XML standard question: does this XML conform to the XML standard?

<child2> some text
<field1> foo </field1>
<field2> bar </field2>
</child2>

"some text" is at the same depth as field1 and field2, but it is text. So
there is a mix of text and elements. I thought that text had to be a
leaf of the tree... So is this allowed by the standard?

Thanks


Philippe Poulard 02-18-2004 09:49 AM

Re: Is it possible with xerces ?
 
Manuel Yguel wrote:
> I am trying to parse an indented XML file with the Xerces-C++ DOM parser.
> The file looks like this:
> <root>
> <child1>
> <field1> foo </field1>
> <field2> bar </field2>
> </child1>
> <child2>
> <field1> foo </field1>
> <field2> bar </field2>
> </child2>
> </root>
>
> where line breaks and whitespace are present in the XML file. The program I
> wrote with DOM gives me this tree:
> root has five children:
> text-node child1 text-node child2 text-node
>
> the text of the first text-node is "\n "
> the text of the second text-node is "\n "
> the text of the third text-node is "\n"
>
> These whitespace-only text nodes occur at every level of the tree hierarchy.
>
> Is it possible to strip these nodes automatically?


Yes: there is an option that strips ignorable whitespace, but you must
supply a grammar that tells the parser where whitespace is ignorable,
like this:

<!ELEMENT root (child1,child2)>
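
To illustrate why the parser needs a grammar before it can drop these nodes, here is a minimal C++ sketch. It does not use the Xerces-C++ API: Node, makeText, makeElement and stripIgnorableWhitespace are hypothetical names, and the elementOnly set is an assumption that stands in for what a declaration such as <!ELEMENT root (child1,child2)> tells a validating parser.

```cpp
#include <algorithm>
#include <cctype>
#include <memory>
#include <set>
#include <string>
#include <vector>

// Hypothetical minimal tree node; this is NOT the Xerces-C++ DOM API.
struct Node {
    bool isText = false;
    std::string value;  // element name, or character data for a text node
    std::vector<std::unique_ptr<Node>> children;
};

std::unique_ptr<Node> makeText(const std::string& data) {
    std::unique_ptr<Node> n(new Node());
    n->isText = true;
    n->value = data;
    return n;
}

std::unique_ptr<Node> makeElement(const std::string& name) {
    std::unique_ptr<Node> n(new Node());
    n->value = name;
    return n;
}

bool isWhitespaceOnly(const std::string& s) {
    return std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isspace(c) != 0; });
}

// Removes whitespace-only text children, but only under elements whose
// content model is element-only (the information a grammar supplies).
// Elements not listed in elementOnly keep all of their text.
void stripIgnorableWhitespace(Node& node,
                              const std::set<std::string>& elementOnly) {
    if (node.isText) return;
    if (elementOnly.count(node.value) > 0) {
        auto& kids = node.children;
        kids.erase(std::remove_if(kids.begin(), kids.end(),
                                  [](const std::unique_ptr<Node>& k) {
                                      return k->isText &&
                                             isWhitespaceOnly(k->value);
                                  }),
                   kids.end());
    }
    for (auto& child : node.children) {
        stripIgnorableWhitespace(*child, elementOnly);
    }
}
```

Calling stripIgnorableWhitespace(*root, {"root", "child1", "child2"}) on the tree from the question would leave root with only its child1 and child2 children, while the text inside field1 and field2 stays untouched. Without the set of element-only names, the function could not tell indentation from meaningful text, which is the reason the option depends on a grammar.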

>
> XML standard question: does this XML conform to the XML standard?
>
> <child2> some text
> <field1> foo </field1>
> <field2> bar </field2>
> </child2>
>
> "some text" is at the same depth as field1 and field2, but it is text. So
> there is a mix of text and elements. I thought that text had to be a
> leaf of the tree... So is this allowed by the standard?


Yes: an element may contain:
- nothing (empty element)
- subelements
- text
- text and subelements (mixed content)
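
The four cases above can be sketched in C++ as a small classifier. This is only an illustration on a hypothetical Node type (not the Xerces-C++ DOM classes); the names Node, Content and classify are assumptions made for the example.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical minimal tree node; this is NOT the Xerces-C++ DOM API.
struct Node {
    bool isText = false;
    std::string value;  // element name, or character data for a text node
    std::vector<std::shared_ptr<Node>> children;
};

enum class Content { Empty, ElementsOnly, TextOnly, Mixed };

// Classifies an element by the kinds of direct children it holds.
// Text nodes are always leaves; elements may hold text, elements, or both.
Content classify(const Node& element) {
    bool hasText = false;
    bool hasElement = false;
    for (const auto& child : element.children) {
        if (child->isText) {
            hasText = true;
        } else {
            hasElement = true;
        }
    }
    if (hasText && hasElement) return Content::Mixed;
    if (hasElement) return Content::ElementsOnly;
    if (hasText) return Content::TextOnly;
    return Content::Empty;
}
```

For the child2 element in the question, the children would be a text node (" some text" plus the line break), field1, a whitespace text node, field2 and a final whitespace text node, so classify reports Mixed. Note that without a grammar an indented element-only element also looks Mixed here, because the indentation shows up as text nodes.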

>
> Thanks
>



--
Cordialement,

///
(. .)
-----ooO--(_)--Ooo-----
| Philippe Poulard |
-----------------------

Manuel Yguel 02-23-2004 07:05 PM

Re: Is it possible with xerces ?
 
Philippe Poulard wrote:
> Yes: there is an option that strips ignorable whitespace, but you must
> supply a grammar that tells the parser where whitespace is ignorable,
> like this:
>
> <!ELEMENT root (child1,child2)>
>

Thanks, but how do you then use the grammar with the parser?



Philippe Poulard 02-24-2004 08:17 AM

Re: Is it possible with xerces ?
 
Manuel Yguel wrote:
>> Yes: there is an option that strips ignorable whitespace, but you must
>> supply a grammar that tells the parser where whitespace is ignorable,
>> like this:
>>
>> <!ELEMENT root (child1,child2)>
>>

> Thanks, but how do you then use the grammar with the parser?
>


Use the <!DOCTYPE> declaration to attach the grammar to the document.
You should have a look at the spec.
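
As a rough sketch of what such a declaration looks like, the C++ below builds the document from the question with the grammar embedded as an internal DTD subset. The helper names withInternalSubset and exampleDocument are hypothetical, and the declarations for child1, child2, field1 and field2 are assumptions extrapolated from the single <!ELEMENT root (child1,child2)> line given earlier in the thread.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: prepends a DOCTYPE declaration whose internal
// subset carries the given markup declarations.
std::string withInternalSubset(const std::string& rootName,
                               const std::vector<std::string>& declarations,
                               const std::string& documentElement) {
    std::string out = "<!DOCTYPE " + rootName + " [\n";
    for (const auto& decl : declarations) {
        out += "  " + decl + "\n";
    }
    out += "]>\n";
    out += documentElement;
    return out;
}

// The document from the question, with an assumed grammar attached.
std::string exampleDocument() {
    return withInternalSubset(
        "root",
        {"<!ELEMENT root (child1,child2)>",
         "<!ELEMENT child1 (field1,field2)>",
         "<!ELEMENT child2 (field1,field2)>",
         "<!ELEMENT field1 (#PCDATA)>",
         "<!ELEMENT field2 (#PCDATA)>"},
        "<root>\n"
        "  <child1>\n"
        "    <field1> foo </field1>\n"
        "    <field2> bar </field2>\n"
        "  </child1>\n"
        "  <child2>\n"
        "    <field1> foo </field1>\n"
        "    <field2> bar </field2>\n"
        "  </child2>\n"
        "</root>\n");
}
```

The name after <!DOCTYPE must match the document element (root here). The grammar can also live in a separate file referenced with a SYSTEM identifier instead of the bracketed subset. On the parser side, as far as I know, the Xerces-C++ XercesDOMParser option is setIncludeIgnorableWhitespace(false), and it only takes effect when validation is enabled.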

