SAX Parser problem

 
 
Mize-ze
      11-13-2006
I am using SAX to parse an XML file.
I want to get the "characters" of a specific tag (element)

Right now I extend DefaultHandler and override the public void
characters(char[] ch, int start, int length) method, but this event is
raised whenever there is content in any tag.
How can I get the characters of one specific element using SAX?

I don't have access to the qName from this event.
Any ideas?

Thanks.

 
Arne Vajhøj
      11-14-2006
Mize-ze wrote:
> I am using SAX to parse an XML file.
> I want to get the "characters" of a specific tag (element)
>
> Right now I extend DefaultHandler and override the public void
> characters(char[] ch, int start, int length) method, but this event is
> raised whenever there is content in any tag.
> How can I get the characters of one specific element using SAX?
>
> I don't have access to the qName from this event.


Override:

public void startElement(
        String namespaceURI,
        String localName,
        String rawName,
        Attributes atts)
        throws SAXException {
    // rawName (the qName) identifies the element that has just opened
}

Arne
 
Mize-ze
      11-16-2006

Arne Vajhøj wrote:
> Override:
>
> public void startElement(
> String namespaceURI,
> String localName,
> String rawName,
> Attributes atts)
> throws SAXException {



But where will I have access to the "characters"? (not to the atts)

<ELEMENT>characters: this is what I want!!</ELEMENT>


thanks

 
Ian Wilson
      11-16-2006
Mize-ze wrote:
> But where will I have access to the "characters"? (not to the atts)
>
> <ELEMENT>characters: this is what I want!!</ELEMENT>


Here's a simple approach which I've used*:

In startElement(), store the localName (or qName). For example you could
store it in an instance variable (i.e. a field) such as String
currentElementName.

In characters() retrieve the stored localName (or qName). You then have
both tagname ("ELEMENT") and content ("characters: this is what I
want!!") together in one place.

If necessary, you could nullify the stored localName (or qName) in
endElement().

* Actually I store a structure that represents all the elements leading
to a particular leaf in the XML tree

e.g. for

                       currentElement
<foo>                  foo
  <bar>                foo.bar
    <baz>XXX</baz>     foo.bar.baz
  </bar>
</foo>
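
The path-tracking idea in the footnote above could be sketched roughly as follows. This is only an illustration, not the code described in the post: the class name PathTracker, the helper pathsOf(), and the dot-separated path format are all assumptions. It uses qName because localName can be an empty string when the parser is not namespace-aware, which is the default for SAXParserFactory.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records the dotted path of every element as it opens.
class PathTracker extends DefaultHandler {
    // Elements that are currently open, root first.
    private final Deque<String> openElements = new ArrayDeque<>();
    // One entry per startElement event, e.g. "foo.bar.baz".
    private final List<String> paths = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        openElements.addLast(qName);
        paths.add(String.join(".", openElements));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Leaving the element: drop it from the current path.
        openElements.removeLast();
    }

    // Convenience helper (an assumption of this sketch): parse a string and return the paths seen.
    static List<String> pathsOf(String xml) {
        PathTracker handler = new PathTracker();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.paths;
    }
}
```

Calling PathTracker.pathsOf("<foo><bar><baz>XXX</baz></bar></foo>") should yield foo, foo.bar and foo.bar.baz, in that order. Inside characters(), the same stack could be joined to find out which leaf the text belongs to.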
 
Donald Roby
      11-16-2006
Ian Wilson wrote:
>
> Here's a simple approach which I've used*:
>
> In startElement(), store the localName (or qName). For example you could
> store it in an instance variable (i.e. a field) such as String
> currentElementName.
>

In startElement(), also initialize a StringBuffer to collect the
characters into.

> In characters() retrieve the stored localName (or qName). You then have
> both tagname ("ELEMENT") and content ("characters: this is what I
> want!!") together in one place.
>

You don't necessarily get them all at once. Collect them into the
above-mentioned StringBuffer in the characters() method for use elsewhere.

> If necessary, you could nullify the stored localName (or qName) in
> endElement().
>

In endElement(), convert the StringBuffer to a String and at this point,
you do have both the tag and the entire character contents.

At this point, I create whatever internal structure I'm building from
the tag and the extracted contents, usually by calling a separate builder
that was passed in via the handler's constructor. I then clear both so
the handler is ready for the next element parsed.
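
The three steps above (start a buffer in startElement(), append in characters(), read the result in endElement()) might look something like this. It is a minimal sketch under stated assumptions, not the poster's actual code: the class name ElementTextCollector and the helper collect() are invented for illustration, StringBuilder is swapped in for StringBuffer because the handler is only used from one thread here, elements are matched by qName with a non-namespace-aware parser, and the target element is assumed not to be nested inside itself.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the complete text of every element named targetName.
class ElementTextCollector extends DefaultHandler {
    private final String targetName;                      // qName to look for
    private final List<String> results = new ArrayList<>();
    private StringBuilder buffer;                         // non-null only inside the target element

    ElementTextCollector(String targetName) {
        this.targetName = targetName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (targetName.equals(qName)) {
            buffer = new StringBuilder();                 // start collecting
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (buffer != null) {
            buffer.append(ch, start, length);             // may run several times per element
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (buffer != null && targetName.equals(qName)) {
            results.add(buffer.toString());               // tag and full text are both known here
            buffer = null;                                // ready for the next element
        }
    }

    // Convenience helper (an assumption of this sketch): parse a string and return the collected texts.
    static List<String> collect(String xml, String targetName) {
        ElementTextCollector handler = new ElementTextCollector(targetName);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.results;
    }
}
```

For example, ElementTextCollector.collect(xml, "name") returns one string per name element, each containing that element's whole text regardless of how many characters() calls delivered it.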

 
Ian Wilson
      11-16-2006
Donald Roby wrote:
> In startElement(), also initialize a StringBuffer to collect the
> characters into.
>
> You don't necessarily get them all at once. Collect them into the
> above-mentioned StringBuffer in the characters() method for use elsewhere.


Thanks for pointing that out!

On re-reading the javadocs for DefaultHandler I now see that it refers
to "each chunk of character data", which is a clue I overlooked.

I'm not sure whether my testing has been lucky or my XML is sufficiently
simple that the first "chunk" will always contain the whole character
data for that element.

Do you know of a simple XML example that illustrates characters()
providing several chunks? Or is it some relatively unpredictable
buffering-related phenomenon?

>> If necessary, you could nullify the stored localName (or qName) in
>> endElement().
>>

> In endElement(), convert the StringBuffer to a String and at this point,
> you do have both the tag and the entire character contents.


Noted
 
Ian Wilson
      11-16-2006
Ian Wilson wrote:
> Do you know of a simple XML example that illustrates characters()
> providing several chunks? Or is it some relatively unpredictable
> buffering-related phenomenon?


It seems to happen if the character data contains newlines.

<inventory>
  <animal type="mammal">
    <name>Fred</name>
    <species>Hippo</species>
    <weight units="Kg">1552</weight>
  </animal>
  <animal type="reptile">
    <name>
      Gert
      AKA Gertrude
      the galloping reptile
    </name>
    <species>Croc</species>
  </animal>
</inventory>

I find characters() is called separately for "Gert", "AKA Gertrude" and
"the galloping reptile".

My XML data has no newlines within character data, so I didn't have a
problem. Nevertheless, I have made the necessary changes just in case.
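
One way to check how a given parser splits the text is to record every characters() call separately. The following is a rough diagnostic sketch; the class name ChunkLogger and the helper chunksOf() are made up for illustration. How many chunks come back, and where they are split, depends on the parser and is not something to rely on. Only the concatenation of the chunks is expected to equal the element's text.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical diagnostic handler: keeps each characters() chunk as its own list entry.
class ChunkLogger extends DefaultHandler {
    private final List<String> chunks = new ArrayList<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Copy the slice now; the parser may reuse the ch array after this call returns.
        chunks.add(new String(ch, start, length));
    }

    // Convenience helper (an assumption of this sketch): parse a string and return the raw chunks.
    static List<String> chunksOf(String xml) {
        ChunkLogger handler = new ChunkLogger();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.chunks;
    }
}
```

Printing ChunkLogger.chunksOf("<name>\nGert\nAKA Gertrude\n</name>").size() shows how many pieces the parser in use delivered for that element.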
 
Arne Vajhøj
      11-17-2006
Mize-ze wrote:
> But where will I have access to the "characters"? (not to the atts)


You find the tag with startElement and the text inside with characters.

Arne
 
vahan
      11-17-2006
In the handler class:

String localName = null;

public void startElement(String uri, String localName,
        String qName, Attributes attributes)
        throws SAXException {
    // remember which element we are currently inside
    this.localName = localName;
}

public void endElement(String uri, String localName,
        String qName) throws SAXException {
    this.localName = null;
}

public void characters(char ch[], int start, int length)
        throws SAXException {
    if ("YourTagName".equalsIgnoreCase(localName)) {
        String desiredContext = new String(ch, start, length);
    }
}



 
Ian Wilson
      11-17-2006
vahan wrote:

<top-posted example code snipped>

You're making the same mistake I did; see earlier in the thread.

For a single element, characters() may be called several times, each
call delivering one chunk of that element's character data.
 